[v2] xfstests: add test for btrfs-progs restore feature

Message ID	1393353848-26790-1-git-send-email-fdmanana@gmail.com (mailing list archive)
State	Not Applicable, archived
Headers	show Return-Path: <linux-btrfs-owner@kernel.org> From: Filipe David Borba Manana <fdmanana@gmail.com> To: xfs@oss.sgi.com Cc: linux-btrfs@vger.kernel.org, Filipe David Borba Manana <fdmanana@gmail.com> Subject: [PATCH v2] xfstests: add test for btrfs-progs restore feature Date: Tue, 25 Feb 2014 18:44:08 +0000 Message-Id: <1393353848-26790-1-git-send-email-fdmanana@gmail.com> In-Reply-To: <1393352816-26065-1-git-send-email-fdmanana@gmail.com> References: <1393352816-26065-1-git-send-email-fdmanana@gmail.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk

Filipe Manana Feb. 25, 2014, 6:44 p.m. UTC

This is a regression test to verify that the restore feature of btrfs-progs
is able to correctly recover files that have compressed extents, specially when
the respective file extent items have a non-zero data offset field.

This issue is fixed by the following btrfs-progs patch:

    Btrfs-progs: fix restore of files with compressed extents

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
---

V2: Fixed title of btrfs-progs patch in the comment and commit message.

 tests/btrfs/043     |  105 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/043.out |   40 ++++++++++++++++++++
 tests/btrfs/group   |    1 +
 3 files changed, 146 insertions(+)
 create mode 100755 tests/btrfs/043
 create mode 100644 tests/btrfs/043.out

Dave Chinner Feb. 25, 2014, 7:54 p.m. UTC | #1

On Tue, Feb 25, 2014 at 06:44:08PM +0000, Filipe David Borba Manana wrote:
> This is a regression test to verify that the restore feature of btrfs-progs
> is able to correctly recover files that have compressed extents, specially when
> the respective file extent items have a non-zero data offset field.
> 
> This issue is fixed by the following btrfs-progs patch:
> 
>     Btrfs-progs: fix restore of files with compressed extents
> 
> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
....
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15

here=`pwd`

> +
> +_cleanup()
> +{
> +    rm -fr $tmp
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# real QA test starts here
> +_supported_fs btrfs
> +_supported_os Linux
> +_require_scratch
> +_need_to_be_root
> +
> +rm -f $seqres.full
> +
> +test_btrfs_restore()
> +{
> +	if [ -z $1 ]
> +	then
> +		OPTIONS=""
> +	else
> +		OPTIONS="-o compress-force=$1"
> +	fi
> +	_scratch_mkfs >/dev/null 2>&1
> +	_scratch_mount $OPTIONS
> +
> +	$XFS_IO_PROG -f -c "pwrite -S 0xff -b 100000 0 100000" \
> +		$SCRATCH_MNT/foo | _filter_xfs_io
> +
> +	# Ensure a single file extent item is persisted.
> +	_run_btrfs_util_prog filesystem sync $SCRATCH_MNT

What's the difference here between "sync" and the command run above?
Unless there's some specific reason for using the above command (and
that needs to be commented), I think that sync(1) should be used
instead in all tests.

Indeed, why a separate command - just adding a "-c fsync" to the
xfs_io command, or even -s to make it open the file O_SYNC should do
what you need without needing a specific sync command....

> +
> +	$XFS_IO_PROG -c "pwrite -S 0xaa -b 100000 100000 100000" \
> +		$SCRATCH_MNT/foo | _filter_xfs_io
> +
> +	# Now ensure a second one is created (and not merged with previous one).
> +	_run_btrfs_util_prog filesystem sync $SCRATCH_MNT
> +
> +	# Make the extent item be split into several ones, each with a data
> +	# offset field != 0
> +	$XFS_IO_PROG -c "pwrite -S 0x1e -b 2 10000 2" $SCRATCH_MNT/foo \
> +		| _filter_xfs_io
> +	$XFS_IO_PROG -c "pwrite -S 0xd0 -b 11 33000 11" $SCRATCH_MNT/foo \
> +		| _filter_xfs_io
> +	$XFS_IO_PROG -c "pwrite -S 0xbc -b 100 99000 100" $SCRATCH_MNT/foo \
> +		| _filter_xfs_io
> +
> +	md5sum $SCRATCH_MNT/foo | _filter_scratch

So you are doing this with first having "persisted" the new extents.
Seems kind of strange that you need to persist some and not
others...

> +	_scratch_unmount
> +	_check_scratch_fs

_check_scratch_fs should be unmounting the SCRATCH_DEV itself
internally. If it's not doing that for btrfs, then the btrfs check
code needs fixing. ;)

> +
> +	_run_btrfs_util_prog restore $SCRATCH_DEV $tmp
> +	md5sum $tmp/foo | cut -d ' ' -f 1

What, exactly, are you restoring to /tmp/$$? Does this assume that
/tmp is a btrfs filesystem? If so, that is an invalid assumption -
/tmp can be any type of filesystem at all.

It's also wrong to use $tmp like this....

> +}
> +
> +mkdir $tmp
> +echo "Testing restore of file compressed with lzo"
> +test_btrfs_restore "lzo"
> +echo "Testing restore of file compressed with zlib"
> +test_btrfs_restore "zlib"
> +echo "Testing restore of file without any compression"
> +test_btrfs_restore

Yup, using $tmp like this is definitely wrong. $tmp is really for test
harness files and test logs, not for *test data*. TEST_DIR is what you
should be using here, not $tmp.

Cheers,

Dave.

Filipe Manana Feb. 25, 2014, 9:02 p.m. UTC | #2

On Tue, Feb 25, 2014 at 7:54 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Tue, Feb 25, 2014 at 06:44:08PM +0000, Filipe David Borba Manana wrote:
>> This is a regression test to verify that the restore feature of btrfs-progs
>> is able to correctly recover files that have compressed extents, specially when
>> the respective file extent items have a non-zero data offset field.
>>
>> This issue is fixed by the following btrfs-progs patch:
>>
>>     Btrfs-progs: fix restore of files with compressed extents
>>
>> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
> ....
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +tmp=/tmp/$$
>> +status=1     # failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>
> here=`pwd`

Didn't we agree before, for a previous path, to export "here" from the
main control skip and then cleanup tests to not redefine it?
I am confused now :)

>
>> +
>> +_cleanup()
>> +{
>> +    rm -fr $tmp
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +
>> +# real QA test starts here
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +_need_to_be_root
>> +
>> +rm -f $seqres.full
>> +
>> +test_btrfs_restore()
>> +{
>> +     if [ -z $1 ]
>> +     then
>> +             OPTIONS=""
>> +     else
>> +             OPTIONS="-o compress-force=$1"
>> +     fi
>> +     _scratch_mkfs >/dev/null 2>&1
>> +     _scratch_mount $OPTIONS
>> +
>> +     $XFS_IO_PROG -f -c "pwrite -S 0xff -b 100000 0 100000" \
>> +             $SCRATCH_MNT/foo | _filter_xfs_io
>> +
>> +     # Ensure a single file extent item is persisted.
>> +     _run_btrfs_util_prog filesystem sync $SCRATCH_MNT
>
> What's the difference here between "sync" and the command run above?
> Unless there's some specific reason for using the above command (and
> that needs to be commented), I think that sync(1) should be used
> instead in all tests.

I want to force a btrfs transaction commit. Plain old 'sync' would do
it as well for sure, but that applies against all mounted FSs, while
btrfs filesystem sync is applied against a single fs.
Plus, since this is a btrfs specific test, is it non sense to exercise
commands from btrfs-progs?

>
> Indeed, why a separate command - just adding a "-c fsync" to the
> xfs_io command, or even -s to make it open the file O_SYNC should do
> what you need without needing a specific sync command....
>
>
>> +
>> +     $XFS_IO_PROG -c "pwrite -S 0xaa -b 100000 100000 100000" \
>> +             $SCRATCH_MNT/foo | _filter_xfs_io
>> +
>> +     # Now ensure a second one is created (and not merged with previous one).
>> +     _run_btrfs_util_prog filesystem sync $SCRATCH_MNT
>> +
>> +     # Make the extent item be split into several ones, each with a data
>> +     # offset field != 0
>> +     $XFS_IO_PROG -c "pwrite -S 0x1e -b 2 10000 2" $SCRATCH_MNT/foo \
>> +             | _filter_xfs_io
>> +     $XFS_IO_PROG -c "pwrite -S 0xd0 -b 11 33000 11" $SCRATCH_MNT/foo \
>> +             | _filter_xfs_io
>> +     $XFS_IO_PROG -c "pwrite -S 0xbc -b 100 99000 100" $SCRATCH_MNT/foo \
>> +             | _filter_xfs_io
>> +
>> +     md5sum $SCRATCH_MNT/foo | _filter_scratch
>
> So you are doing this with first having "persisted" the new extents.
> Seems kind of strange that you need to persist some and not
> others...

I need to make sure there's fragmentation (i.e. several file extent
items in the fs btree with data offset fields > 0).

>
>> +     _scratch_unmount
>> +     _check_scratch_fs
>
> _check_scratch_fs should be unmounting the SCRATCH_DEV itself
> internally. If it's not doing that for btrfs, then the btrfs check
> code needs fixing. ;)

No it doesn't unmount.

>
>> +
>> +     _run_btrfs_util_prog restore $SCRATCH_DEV $tmp
>> +     md5sum $tmp/foo | cut -d ' ' -f 1
>
> What, exactly, are you restoring to /tmp/$$? Does this assume that
> /tmp is a btrfs filesystem? If so, that is an invalid assumption -
> /tmp can be any type of filesystem at all.

The restore command allows you to grab files from a (potentially
damaged) btrfs filesystem and save them to a destination path, no
matter what its filesystem is (btrfs, extN, xfs, etc)

>
> It's also wrong to use $tmp like this....
>
>> +}
>> +
>> +mkdir $tmp
>> +echo "Testing restore of file compressed with lzo"
>> +test_btrfs_restore "lzo"
>> +echo "Testing restore of file compressed with zlib"
>> +test_btrfs_restore "zlib"
>> +echo "Testing restore of file without any compression"
>> +test_btrfs_restore
>
> Yup, using $tmp like this is definitely wrong. $tmp is really for test
> harness files and test logs, not for *test data*. TEST_DIR is what you
> should be using here, not $tmp.

Alright... Many other existing tests do things like this.

Thanks Dave

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

Dave Chinner Feb. 25, 2014, 10:11 p.m. UTC | #3

On Tue, Feb 25, 2014 at 09:02:43PM +0000, Filipe David Manana wrote:
> On Tue, Feb 25, 2014 at 7:54 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Tue, Feb 25, 2014 at 06:44:08PM +0000, Filipe David Borba Manana wrote:
> >> This is a regression test to verify that the restore feature of btrfs-progs
> >> is able to correctly recover files that have compressed extents, specially when
> >> the respective file extent items have a non-zero data offset field.
> >>
> >> This issue is fixed by the following btrfs-progs patch:
> >>
> >>     Btrfs-progs: fix restore of files with compressed extents
> >>
> >> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
> > ....
> >> +seq=`basename $0`
> >> +seqres=$RESULT_DIR/$seq
> >> +echo "QA output created by $seq"
> >> +
> >> +tmp=/tmp/$$
> >> +status=1     # failure is the default!
> >> +trap "_cleanup; exit \$status" 0 1 2 3 15
> >
> > here=`pwd`
> 
> Didn't we agree before, for a previous path, to export "here" from the
> main control skip and then cleanup tests to not redefine it?
> I am confused now :)

Yes, we did, but there's no patch to do that yet. Hence tests need
to define it until the infrastructure is changed.....

> >> +     _scratch_mount $OPTIONS
> >> +
> >> +     $XFS_IO_PROG -f -c "pwrite -S 0xff -b 100000 0 100000" \
> >> +             $SCRATCH_MNT/foo | _filter_xfs_io
> >> +
> >> +     # Ensure a single file extent item is persisted.
> >> +     _run_btrfs_util_prog filesystem sync $SCRATCH_MNT
> >
> > What's the difference here between "sync" and the command run above?
> > Unless there's some specific reason for using the above command (and
> > that needs to be commented), I think that sync(1) should be used
> > instead in all tests.
> 
> I want to force a btrfs transaction commit. Plain old 'sync' would do
> it as well for sure, but that applies against all mounted FSs, while
> btrfs filesystem sync is applied against a single fs.

And the problem with syncing all the filesystems is what?

> Plus, since this is a btrfs specific test, is it non sense to exercise
> commands from btrfs-progs?

At the expense of testing the command that almost every user in
the world uses to sync their filesystems?

> >> +             | _filter_xfs_io
> >> +     $XFS_IO_PROG -c "pwrite -S 0xd0 -b 11 33000 11" $SCRATCH_MNT/foo \
> >> +             | _filter_xfs_io
> >> +     $XFS_IO_PROG -c "pwrite -S 0xbc -b 100 99000 100" $SCRATCH_MNT/foo \
> >> +             | _filter_xfs_io
> >> +
> >> +     md5sum $SCRATCH_MNT/foo | _filter_scratch
> >
> > So you are doing this with first having "persisted" the new extents.
> > Seems kind of strange that you need to persist some and not
> > others...
> 
> I need to make sure there's fragmentation (i.e. several file extent
> items in the fs btree with data offset fields > 0).

Right, but my question is why you haven't ensured that btree is on
disk at the time you run the md5sum. That seems important to me
because the above sync commands indicate that having the extents on
disk rather than in memory is important here. e.g. are you expecting
the md5sum to be correct before the data is synced to disk, and then
incorrect after the data is synced to disk by the unmount?

Will you remember this detail in five years time when this
test detects a regression? Indeed, will you even be around to
explain why the test does this in five years time? These regression
tests are going to be around for the entire life of btrfs, so we
better make sure that there's enough information in them to be able
to maintain them in 10-15 years time....

> 
> >
> >> +     _scratch_unmount
> >> +     _check_scratch_fs
> >
> > _check_scratch_fs should be unmounting the SCRATCH_DEV itself
> > internally. If it's not doing that for btrfs, then the btrfs check
> > code needs fixing. ;)
> 
> No it doesn't unmount.

Then _check_btrfs_filesystem() needs fixing. It certainly does have
code in it to check and unmount the scratch device, so if that is
not happening then there's something broken that needs to be fixed.

Fix the infrastructure bug, don't work around it.

> >> +
> >> +     _run_btrfs_util_prog restore $SCRATCH_DEV $tmp
> >> +     md5sum $tmp/foo | cut -d ' ' -f 1
> >
> > What, exactly, are you restoring to /tmp/$$? Does this assume that
> > /tmp is a btrfs filesystem? If so, that is an invalid assumption -
> > /tmp can be any type of filesystem at all.
> 
> The restore command allows you to grab files from a (potentially
> damaged) btrfs filesystem and save them to a destination path, no
> matter what its filesystem is (btrfs, extN, xfs, etc)

Needs a comment.

[ Ugh. btrfs defines "restore" to mean "recover from [broken]
device", not "restore from backup" like it's used everywhere else in
the filesystem world? ]

> > It's also wrong to use $tmp like this....
> >
> >> +}
> >> +
> >> +mkdir $tmp
> >> +echo "Testing restore of file compressed with lzo"
> >> +test_btrfs_restore "lzo"
> >> +echo "Testing restore of file compressed with zlib"
> >> +test_btrfs_restore "zlib"
> >> +echo "Testing restore of file without any compression"
> >> +test_btrfs_restore
> >
> > Yup, using $tmp like this is definitely wrong. $tmp is really for test
> > harness files and test logs, not for *test data*. TEST_DIR is what you
> > should be using here, not $tmp.
> 
> Alright... Many other existing tests do things like this.

Yes, but we don't want new tests to do this. I get annoyed by tests
failing due to running of disk space on /tmp or OOMing on machines
that use a small tmpfs filesystem for /tmp because tests use it for
dumping large test files rather than TEST_DIR....

Cheers,

Dave.

Filipe Manana Feb. 25, 2014, 10:34 p.m. UTC | #4

On Tue, Feb 25, 2014 at 10:11 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Tue, Feb 25, 2014 at 09:02:43PM +0000, Filipe David Manana wrote:
>> On Tue, Feb 25, 2014 at 7:54 PM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Tue, Feb 25, 2014 at 06:44:08PM +0000, Filipe David Borba Manana wrote:
>> >> This is a regression test to verify that the restore feature of btrfs-progs
>> >> is able to correctly recover files that have compressed extents, specially when
>> >> the respective file extent items have a non-zero data offset field.
>> >>
>> >> This issue is fixed by the following btrfs-progs patch:
>> >>
>> >>     Btrfs-progs: fix restore of files with compressed extents
>> >>
>> >> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
>> > ....
>> >> +seq=`basename $0`
>> >> +seqres=$RESULT_DIR/$seq
>> >> +echo "QA output created by $seq"
>> >> +
>> >> +tmp=/tmp/$$
>> >> +status=1     # failure is the default!
>> >> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> >
>> > here=`pwd`
>>
>> Didn't we agree before, for a previous path, to export "here" from the
>> main control skip and then cleanup tests to not redefine it?
>> I am confused now :)
>
> Yes, we did, but there's no patch to do that yet. Hence tests need
> to define it until the infrastructure is changed.....

There's a patch flying around that adds the _require_fssum() and then
removes definition of "here" for all btrfs tests that use fssum.

>
>> >> +     _scratch_mount $OPTIONS
>> >> +
>> >> +     $XFS_IO_PROG -f -c "pwrite -S 0xff -b 100000 0 100000" \
>> >> +             $SCRATCH_MNT/foo | _filter_xfs_io
>> >> +
>> >> +     # Ensure a single file extent item is persisted.
>> >> +     _run_btrfs_util_prog filesystem sync $SCRATCH_MNT
>> >
>> > What's the difference here between "sync" and the command run above?
>> > Unless there's some specific reason for using the above command (and
>> > that needs to be commented), I think that sync(1) should be used
>> > instead in all tests.
>>
>> I want to force a btrfs transaction commit. Plain old 'sync' would do
>> it as well for sure, but that applies against all mounted FSs, while
>> btrfs filesystem sync is applied against a single fs.
>
> And the problem with syncing all the filesystems is what?
>
>> Plus, since this is a btrfs specific test, is it non sense to exercise
>> commands from btrfs-progs?
>
> At the expense of testing the command that almost every user in
> the world uses to sync their filesystems?
>
>> >> +             | _filter_xfs_io
>> >> +     $XFS_IO_PROG -c "pwrite -S 0xd0 -b 11 33000 11" $SCRATCH_MNT/foo \
>> >> +             | _filter_xfs_io
>> >> +     $XFS_IO_PROG -c "pwrite -S 0xbc -b 100 99000 100" $SCRATCH_MNT/foo \
>> >> +             | _filter_xfs_io
>> >> +
>> >> +     md5sum $SCRATCH_MNT/foo | _filter_scratch
>> >
>> > So you are doing this with first having "persisted" the new extents.
>> > Seems kind of strange that you need to persist some and not
>> > others...

All I want is to have different file extent items.

>>
>> I need to make sure there's fragmentation (i.e. several file extent
>> items in the fs btree with data offset fields > 0).
>
> Right, but my question is why you haven't ensured that btree is on
> disk at the time you run the md5sum.

Because it's not needed.
The sync is only to make sure the first 2 extent items aren't merged
together. And that is needed to trigger the failure.

> That seems important to me
> because the above sync commands indicate that having the extents on
> disk rather than in memory is important here. e.g. are you expecting
> the md5sum to be correct before the data is synced to disk, and then
> incorrect after the data is synced to disk by the unmount?

Again what's important is having multiple extent items after unmounting.

>
> Will you remember this detail in five years time when this
> test detects a regression? Indeed, will you even be around to
> explain why the test does this in five years time? These regression
> tests are going to be around for the entire life of btrfs, so we
> better make sure that there's enough information in them to be able
> to maintain them in 10-15 years time....

Ok, I hope to be alive in 5, 10 or even 15 years, and I'll make an
effort to remain healthy :)
I'll rephrase the comments to hopefully be more clear about the
intention and conditions necessary to trigger the issue.

>
>>
>> >
>> >> +     _scratch_unmount
>> >> +     _check_scratch_fs
>> >
>> > _check_scratch_fs should be unmounting the SCRATCH_DEV itself
>> > internally. If it's not doing that for btrfs, then the btrfs check
>> > code needs fixing. ;)
>>
>> No it doesn't unmount.
>
> Then _check_btrfs_filesystem() needs fixing. It certainly does have
> code in it to check and unmount the scratch device, so if that is
> not happening then there's something broken that needs to be fixed.

So _check_btrfs_filesystem() will unmount the fs if it's mounted, do
the fsck thing and then remount it. If the isn't mounted when it's
called, it will not mount/remount it after doing the fsck. Very
explicit, I don't know the motivation for that behaviour.

>
> Fix the infrastructure bug, don't work around it.
>
>> >> +
>> >> +     _run_btrfs_util_prog restore $SCRATCH_DEV $tmp
>> >> +     md5sum $tmp/foo | cut -d ' ' -f 1
>> >
>> > What, exactly, are you restoring to /tmp/$$? Does this assume that
>> > /tmp is a btrfs filesystem? If so, that is an invalid assumption -
>> > /tmp can be any type of filesystem at all.
>>
>> The restore command allows you to grab files from a (potentially
>> damaged) btrfs filesystem and save them to a destination path, no
>> matter what its filesystem is (btrfs, extN, xfs, etc)
>
> Needs a comment.

Ok... so should I make a comment to explain what btrfs restore does?
Is it unreasonable to expect an unfamiliar reader to run btrfs --help
or check the man page for example to see what this command is?

>
> [ Ugh. btrfs defines "restore" to mean "recover from [broken]
> device", not "restore from backup" like it's used everywhere else in
> the filesystem world? ]
>
>
>> > It's also wrong to use $tmp like this....
>> >
>> >> +}
>> >> +
>> >> +mkdir $tmp
>> >> +echo "Testing restore of file compressed with lzo"
>> >> +test_btrfs_restore "lzo"
>> >> +echo "Testing restore of file compressed with zlib"
>> >> +test_btrfs_restore "zlib"
>> >> +echo "Testing restore of file without any compression"
>> >> +test_btrfs_restore
>> >
>> > Yup, using $tmp like this is definitely wrong. $tmp is really for test
>> > harness files and test logs, not for *test data*. TEST_DIR is what you
>> > should be using here, not $tmp.
>>
>> Alright... Many other existing tests do things like this.
>
> Yes, but we don't want new tests to do this. I get annoyed by tests
> failing due to running of disk space on /tmp or OOMing on machines
> that use a small tmpfs filesystem for /tmp because tests use it for
> dumping large test files rather than TEST_DIR....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

Dave Chinner Feb. 25, 2014, 11:33 p.m. UTC | #5

On Tue, Feb 25, 2014 at 10:34:07PM +0000, Filipe David Manana wrote:
> On Tue, Feb 25, 2014 at 10:11 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Tue, Feb 25, 2014 at 09:02:43PM +0000, Filipe David Manana wrote:
> >> On Tue, Feb 25, 2014 at 7:54 PM, Dave Chinner <david@fromorbit.com> wrote:
> >> > On Tue, Feb 25, 2014 at 06:44:08PM +0000, Filipe David Borba Manana wrote:
> >> >> This is a regression test to verify that the restore feature of btrfs-progs
> >> >> is able to correctly recover files that have compressed extents, specially when
> >> >> the respective file extent items have a non-zero data offset field.
> >> >>
> >> >> This issue is fixed by the following btrfs-progs patch:
> >> >>
> >> >>     Btrfs-progs: fix restore of files with compressed extents
> >> >>
> >> >> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
> >> > ....
> >> >> +seq=`basename $0`
> >> >> +seqres=$RESULT_DIR/$seq
> >> >> +echo "QA output created by $seq"
> >> >> +
> >> >> +tmp=/tmp/$$
> >> >> +status=1     # failure is the default!
> >> >> +trap "_cleanup; exit \$status" 0 1 2 3 15
> >> >
> >> > here=`pwd`
> >>
> >> Didn't we agree before, for a previous path, to export "here" from the
> >> main control skip and then cleanup tests to not redefine it?
> >> I am confused now :)
> >
> > Yes, we did, but there's no patch to do that yet. Hence tests need
> > to define it until the infrastructure is changed.....
> 
> There's a patch flying around that adds the _require_fssum() and then
> removes definition of "here" for all btrfs tests that use fssum.

changing how $here is defined needs to be in a patch of it's own,
and that patch needs to remove it from every single test in the
xfstests code base that declares it. test harness infrastructure
changes should not be buried in an unrelated btrfs test changes....

> >> >> +             | _filter_xfs_io
> >> >> +     $XFS_IO_PROG -c "pwrite -S 0xd0 -b 11 33000 11" $SCRATCH_MNT/foo \
> >> >> +             | _filter_xfs_io
> >> >> +     $XFS_IO_PROG -c "pwrite -S 0xbc -b 100 99000 100" $SCRATCH_MNT/foo \
> >> >> +             | _filter_xfs_io
> >> >> +
> >> >> +     md5sum $SCRATCH_MNT/foo | _filter_scratch
> >> >
> >> > So you are doing this with first having "persisted" the new extents.
> >> > Seems kind of strange that you need to persist some and not
> >> > others...
> 
> All I want is to have different file extent items.

Yes, I get that. What is not clear is where you expect
the failure to be detected - in memory or on disk?

> >> I need to make sure there's fragmentation (i.e. several file extent
> >> items in the fs btree with data offset fields > 0).
> >
> > Right, but my question is why you haven't ensured that btree is on
> > disk at the time you run the md5sum.
> 
> Because it's not needed.
> The sync is only to make sure the first 2 extent items aren't merged
> together. And that is needed to trigger the failure.

Yes, it's a pre-condition. That's not the answer to the question
I've been asking, though.

> > That seems important to me
> > because the above sync commands indicate that having the extents on
> > disk rather than in memory is important here. e.g. are you expecting
> > the md5sum to be correct before the data is synced to disk, and then
> > incorrect after the data is synced to disk by the unmount?
> 
> Again what's important is having multiple extent items after unmounting.

Ah, so syncing the data after the second set of writes is important,
and that's what you are testing. So, why aren't you testing this
with sync and fiemap? Or is it the unmount that matters, rather than
the data writeback?

Do you see what I'm trying to understand? If it's data writeback
that triggers the bug or enables it to be detected, then that's what
the test should use as the trigger. If unmount is necessary, then a
comment saying "data writeback is not sufficient to trigger or
detect the corruption - we need can only detect it via unmount and a
fsck pass"....

> > code in it to check and unmount the scratch device, so if that is
> > not happening then there's something broken that needs to be fixed.
> 
> So _check_btrfs_filesystem() will unmount the fs if it's mounted, do
> the fsck thing and then remount it. If the isn't mounted when it's
> called, it will not mount/remount it after doing the fsck. Very
> explicit, I don't know the motivation for that behaviour.

Ok, so the problem is not as you described - it's not that
_check_scratch_fs doesn't unmount the filesystem, it's that it
mounts it again after the check and the test requires it to be
unmounted *after* it has been checked.  See how much time a simple
comment can save? :)

> >> matter what its filesystem is (btrfs, extN, xfs, etc)
> >
> > Needs a comment.
> 
> Ok... so should I make a comment to explain what btrfs restore does?
> Is it unreasonable to expect an unfamiliar reader to run btrfs --help
> or check the man page for example to see what this command is?

It is unreasonable to expect them to understand why you used btrfs
restore rather than just doing "_scratch_check_fs; md5sum
$testfile". That's what the comment needs to explain, because I
still don't understand why the test uses btrfs restore or why it
would give any different result to the script fragment in the
previous sentence...

Cheers,

Dave.

Filipe Manana Feb. 26, 2014, 12:34 a.m. UTC | #6

On Tue, Feb 25, 2014 at 11:33 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Tue, Feb 25, 2014 at 10:34:07PM +0000, Filipe David Manana wrote:
>> On Tue, Feb 25, 2014 at 10:11 PM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Tue, Feb 25, 2014 at 09:02:43PM +0000, Filipe David Manana wrote:
>> >> On Tue, Feb 25, 2014 at 7:54 PM, Dave Chinner <david@fromorbit.com> wrote:
>> >> > On Tue, Feb 25, 2014 at 06:44:08PM +0000, Filipe David Borba Manana wrote:
>> >> >> This is a regression test to verify that the restore feature of btrfs-progs
>> >> >> is able to correctly recover files that have compressed extents, specially when
>> >> >> the respective file extent items have a non-zero data offset field.
>> >> >>
>> >> >> This issue is fixed by the following btrfs-progs patch:
>> >> >>
>> >> >>     Btrfs-progs: fix restore of files with compressed extents
>> >> >>
>> >> >> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
>> >> > ....
>> >> >> +seq=`basename $0`
>> >> >> +seqres=$RESULT_DIR/$seq
>> >> >> +echo "QA output created by $seq"
>> >> >> +
>> >> >> +tmp=/tmp/$$
>> >> >> +status=1     # failure is the default!
>> >> >> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> >> >
>> >> > here=`pwd`
>> >>
>> >> Didn't we agree before, for a previous path, to export "here" from the
>> >> main control skip and then cleanup tests to not redefine it?
>> >> I am confused now :)
>> >
>> > Yes, we did, but there's no patch to do that yet. Hence tests need
>> > to define it until the infrastructure is changed.....
>>
>> There's a patch flying around that adds the _require_fssum() and then
>> removes definition of "here" for all btrfs tests that use fssum.
>
> changing how $here is defined needs to be in a patch of it's own,
> and that patch needs to remove it from every single test in the
> xfstests code base that declares it. test harness infrastructure
> changes should not be buried in an unrelated btrfs test changes....
>
>> >> >> +             | _filter_xfs_io
>> >> >> +     $XFS_IO_PROG -c "pwrite -S 0xd0 -b 11 33000 11" $SCRATCH_MNT/foo \
>> >> >> +             | _filter_xfs_io
>> >> >> +     $XFS_IO_PROG -c "pwrite -S 0xbc -b 100 99000 100" $SCRATCH_MNT/foo \
>> >> >> +             | _filter_xfs_io
>> >> >> +
>> >> >> +     md5sum $SCRATCH_MNT/foo | _filter_scratch
>> >> >
>> >> > So you are doing this with first having "persisted" the new extents.
>> >> > Seems kind of strange that you need to persist some and not
>> >> > others...
>>
>> All I want is to have different file extent items.
>
> Yes, I get that. What is not clear is where you expect
> the failure to be detected - in memory or on disk?
>
>> >> I need to make sure there's fragmentation (i.e. several file extent
>> >> items in the fs btree with data offset fields > 0).
>> >
>> > Right, but my question is why you haven't ensured that btree is on
>> > disk at the time you run the md5sum.
>>
>> Because it's not needed.
>> The sync is only to make sure the first 2 extent items aren't merged
>> together. And that is needed to trigger the failure.
>
> Yes, it's a pre-condition. That's not the answer to the question
> I've been asking, though.
>
>> > That seems important to me
>> > because the above sync commands indicate that having the extents on
>> > disk rather than in memory is important here. e.g. are you expecting
>> > the md5sum to be correct before the data is synced to disk, and then
>> > incorrect after the data is synced to disk by the unmount?
>>
>> Again what's important is having multiple extent items after unmounting.
>
> Ah, so syncing the data after the second set of writes is important,
> and that's what you are testing. So, why aren't you testing this
> with sync and fiemap? Or is it the unmount that matters, rather than
> the data writeback?
>
> Do you see what I'm trying to understand? If it's data writeback
> that triggers the bug or enables it to be detected, then that's what
> the test should use as the trigger. If unmount is necessary, then a
> comment saying "data writeback is not sufficient to trigger or
> detect the corruption - we need can only detect it via unmount and a
> fsck pass"....

See my last reply below, which answers this.

>
>> > code in it to check and unmount the scratch device, so if that is
>> > not happening then there's something broken that needs to be fixed.
>>
>> So _check_btrfs_filesystem() will unmount the fs if it's mounted, do
>> the fsck thing and then remount it. If the isn't mounted when it's
>> called, it will not mount/remount it after doing the fsck. Very
>> explicit, I don't know the motivation for that behaviour.
>
> Ok, so the problem is not as you described - it's not that
> _check_scratch_fs doesn't unmount the filesystem, it's that it
> mounts it again after the check and the test requires it to be
> unmounted *after* it has been checked.  See how much time a simple
> comment can save? :)
>
>> >> matter what its filesystem is (btrfs, extN, xfs, etc)
>> >
>> > Needs a comment.
>>
>> Ok... so should I make a comment to explain what btrfs restore does?
>> Is it unreasonable to expect an unfamiliar reader to run btrfs --help
>> or check the man page for example to see what this command is?
>
> It is unreasonable to expect them to understand why you used btrfs
> restore rather than just doing "_scratch_check_fs; md5sum
> $testfile". That's what the comment needs to explain, because I
> still don't understand why the test uses btrfs restore or why it
> would give any different result to the script fragment in the
> previous sentence...

So, the problem is that the restore command, which only reads from an
unmounted fs (from disk directly) was incorrectly reading certain file
extents (compressed extents with a data offset field != 0). This is
basically what the comment at the top says:

+# Test that btrfs-progs' restore command is able to correctly recover files
+# that have compressed extents, specially when the respective file extent
+# items have a non-zero data offset field.

I'll add a comment just before calling restore...

Thanks Dave

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

[v2] xfstests: add test for btrfs-progs restore feature

Commit Message

Comments

Patch