fstest: btrfs/083: Test for incorrect exclusive reference number after file clone.

Message ID 1425973594-31936-1-git-send-email-quwenruo@cn.fujitsu.com (mailing list archive)
State New, archived

Commit Message

Qu Wenruo March 10, 2015, 7:46 a.m. UTC
[Problem]
Since commit fcebe4562dec83b3f8d308 ("Btrfs: rework qgroup accounting"),
the quota data update is delayed until after the delayed_ref calculation,
and it lacks correct protection to detect a root reference which shouldn't
be counted in the current sequence number but has already been written
into the extent backref.

As a result, the exclusive reference count is not decreased correctly,
which gives an incorrect result.

[Test procedure]
1. Create a btrfs with 3 subvolumes, quota enabled and rescanned.
2. Create a file in 1st subvolume
3. Clone the file to 2nd and 3rd subvolume
4. Sync the fs to reflect the changes in qgroup.
5. Check the qgroup data

[Expected result]
None of the subvolumes has an exclusive reference to the file.

[Actual result]
The first subvolume still has an exclusive reference to the file.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 tests/btrfs/083     | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/083.out |  5 ++++
 tests/btrfs/group   |  1 +
 3 files changed, 82 insertions(+)
 create mode 100755 tests/btrfs/083
 create mode 100644 tests/btrfs/083.out

Comments

Filipe Manana March 12, 2015, 12:38 p.m. UTC | #1
On Tue, Mar 10, 2015 at 7:46 AM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
> [Problem]
> Since commit fcebe4562dec83b3f8d308 ("Btrfs: rework qgroup accounting"),
> quota data update is delayed after delayed_ref calculation, and lacks
> correct protection to detect root reference which shouldn't be counted
> in current sequence number but already written into extent backref.
>
> This makes exclusive reference not decreased correctly and give incorrect
> result.
>
> [Test procedure]
> 1. Create a btrfs with 3 subvolumes, quota enabled and rescanned.
> 2. Create a file in 1st subvolume
> 3. Clone the file to 2nd and 3rd subvolume
> 4. Sync the fs to reflect the changes in qgroup.
> 5. Check the qgroup data
>
> [Expected result]
> None of the subvolume has exclusive reference to the file.
>
> [Actual result]
> The first subvolume still have exclusive reference to the file.
>
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> ---
>  tests/btrfs/083     | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/083.out |  5 ++++
>  tests/btrfs/group   |  1 +
>  3 files changed, 82 insertions(+)
>  create mode 100755 tests/btrfs/083
>  create mode 100644 tests/btrfs/083.out
>
> diff --git a/tests/btrfs/083 b/tests/btrfs/083
> new file mode 100755
> index 0000000..17fd30b
> --- /dev/null
> +++ b/tests/btrfs/083
> @@ -0,0 +1,76 @@
> +#! /bin/bash
> +# FS QA Test No. 083
> +#
> +# Test for incorrect exclusive reference count after cloning file
> +# between subvolumes.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2015 Fujitsu. All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1       # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +    rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_need_to_be_root
> +_supported_fs btrfs
> +_supported_os Linux
> +_require_scratch
> +_require_cp_reflink
> +
> +run_check _scratch_mkfs "--nodesize 4096"

--nodesize 65536
Otherwise the test fails (unnecessarily) on platforms with a page size > 4Kb.

> +
> +# inode cache will also take space in fs tree, disable them to get consistent
> +# result.
> +run_check _scratch_mount "-o noinode_cache"

-o noinode_cache, unlike -o inode_cache, is still a relatively new
mount option (early 2014). Won't the test fail on older kernels that
don't recognize this mount option?

> +
> +_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv1
> +_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv2
> +_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv3
> +
> +_run_btrfs_util_prog quota enable $SCRATCH_MNT
> +_run_btrfs_util_prog quota rescan -w $SCRATCH_MNT
> +
> +dd if=/dev/zero of=$SCRATCH_MNT/subv1/file1 bs=4K count=64 &> /dev/null

Why ignore dd failures? Normally we want the output (especially errors)
to be matched against the golden output.

> +cp --reflink $SCRATCH_MNT/subv1/file1 $SCRATCH_MNT/subv2/file1
> +cp --reflink $SCRATCH_MNT/subv1/file1 $SCRATCH_MNT/subv3/file1
> +_run_btrfs_util_prog filesystem sync $SCRATCH_MNT

Plain old 'sync' should work here as well.

thanks Qu

> +
> +units=`_btrfs_qgroup_units`
> +$BTRFS_UTIL_PROG qgroup show $units $SCRATCH_MNT | $SED_PROG -n '/[0-9]/p' \
> +       | $AWK_PROG '{print $2" "$3}'
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/083.out b/tests/btrfs/083.out
> new file mode 100644
> index 0000000..359b4a0
> --- /dev/null
> +++ b/tests/btrfs/083.out
> @@ -0,0 +1,5 @@
> +QA output created by 083
> +4096 4096
> +266240 4096
> +266240 4096
> +266240 4096
> diff --git a/tests/btrfs/group b/tests/btrfs/group
> index fd2fa76..04d5d67 100644
> --- a/tests/btrfs/group
> +++ b/tests/btrfs/group
> @@ -85,3 +85,4 @@
>  080 auto snapshot
>  081 auto quick clone
>  082 auto quick remount
> +083 auto quick qgroup
> --
> 2.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik March 12, 2015, 12:49 p.m. UTC | #2
On 03/12/2015 08:38 AM, Filipe David Manana wrote:
> On Tue, Mar 10, 2015 at 7:46 AM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>> [Problem]
>> Since commit fcebe4562dec83b3f8d308 ("Btrfs: rework qgroup accounting"),
>> quota data update is delayed after delayed_ref calculation, and lacks
>> correct protection to detect root reference which shouldn't be counted
>> in current sequence number but already written into extent backref.
>>
>> This makes exclusive reference not decreased correctly and give incorrect
>> result.
>>
>> [Test procedure]
>> 1. Create a btrfs with 3 subvolumes, quota enabled and rescanned.
>> 2. Create a file in 1st subvolume
>> 3. Clone the file to 2nd and 3rd subvolume
>> 4. Sync the fs to reflect the changes in qgroup.
>> 5. Check the qgroup data
>>
>> [Expected result]
>> None of the subvolume has exclusive reference to the file.
>>
>> [Actual result]
>> The first subvolume still have exclusive reference to the file.
>>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> ---
>>   tests/btrfs/083     | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   tests/btrfs/083.out |  5 ++++
>>   tests/btrfs/group   |  1 +
>>   3 files changed, 82 insertions(+)
>>   create mode 100755 tests/btrfs/083
>>   create mode 100644 tests/btrfs/083.out
>>
>> diff --git a/tests/btrfs/083 b/tests/btrfs/083
>> new file mode 100755
>> index 0000000..17fd30b
>> --- /dev/null
>> +++ b/tests/btrfs/083
>> @@ -0,0 +1,76 @@
>> +#! /bin/bash
>> +# FS QA Test No. 083
>> +#
>> +# Test for incorrect exclusive reference count after cloning file
>> +# between subvolumes.
>> +#
>> +#-----------------------------------------------------------------------
>> +# Copyright (c) 2015 Fujitsu. All Rights Reserved.
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU General Public License as
>> +# published by the Free Software Foundation.
>> +#
>> +# This program is distributed in the hope that it would be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program; if not, write the Free Software Foundation,
>> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
>> +#-----------------------------------------------------------------------
>> +#
>> +
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +tmp=/tmp/$$
>> +status=1       # failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +_cleanup()
>> +{
>> +    rm -f $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +
>> +# real QA test starts here
>> +
>> +# Modify as appropriate.
>> +_need_to_be_root
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +_require_cp_reflink
>> +
>> +run_check _scratch_mkfs "--nodesize 4096"
>
> --nodesize 65536
> Otherwise the test fails (unnecessarily) on platforms with a page size > 4Kb.
>

Leave this bit, we're going to merge the sub-page blocksize stuff soon 
anyway, and it makes the numbers add up easier for qgroup stuff.

>> +
>> +# inode cache will also take space in fs tree, disable them to get consistent
>> +# result.
>> +run_check _scratch_mount "-o noinode_cache"
>
> -o noinode_cache, unlike -o inode_cache, is still a relatively new
> mount option (early 2014). Won't the test fail on older kernels that
> don't recognize this mount option?

Yeah, but my qgroup patch wasn't in until late last year anyway. We're not
super worried about older kernels failing tests; we already know they're
broken.  The other comments are fine.  Thanks,

Josef
Qu Wenruo March 13, 2015, 1:03 a.m. UTC | #3
-------- Original Message  --------
Subject: Re: [PATCH] fstest: btrfs/083: Test for incorrect exclusive
reference number after file clone.
From: Josef Bacik <jbacik@fb.com>
To: <fdmanana@gmail.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2015-03-12 20:49

> On 03/12/2015 08:38 AM, Filipe David Manana wrote:
>> On Tue, Mar 10, 2015 at 7:46 AM, Qu Wenruo <quwenruo@cn.fujitsu.com>
>> wrote:
>>> [Problem]
>>> Since commit fcebe4562dec83b3f8d308 ("Btrfs: rework qgroup accounting"),
>>> quota data update is delayed after delayed_ref calculation, and lacks
>>> correct protection to detect root reference which shouldn't be counted
>>> in current sequence number but already written into extent backref.
>>>
>>> This makes exclusive reference not decreased correctly and give
>>> incorrect
>>> result.
>>>
>>> [Test procedure]
>>> 1. Create a btrfs with 3 subvolumes, quota enabled and rescanned.
>>> 2. Create a file in 1st subvolume
>>> 3. Clone the file to 2nd and 3rd subvolume
>>> 4. Sync the fs to reflect the changes in qgroup.
>>> 5. Check the qgroup data
>>>
>>> [Expected result]
>>> None of the subvolume has exclusive reference to the file.
>>>
>>> [Actual result]
>>> The first subvolume still have exclusive reference to the file.
>>>
>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>>> ---
>>>   tests/btrfs/083     | 76
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   tests/btrfs/083.out |  5 ++++
>>>   tests/btrfs/group   |  1 +
>>>   3 files changed, 82 insertions(+)
>>>   create mode 100755 tests/btrfs/083
>>>   create mode 100644 tests/btrfs/083.out
>>>
>>> diff --git a/tests/btrfs/083 b/tests/btrfs/083
>>> new file mode 100755
>>> index 0000000..17fd30b
>>> --- /dev/null
>>> +++ b/tests/btrfs/083
>>> @@ -0,0 +1,76 @@
>>> +#! /bin/bash
>>> +# FS QA Test No. 083
>>> +#
>>> +# Test for incorrect exclusive reference count after cloning file
>>> +# between subvolumes.
>>> +#
>>> +#-----------------------------------------------------------------------
>>>
>>> +# Copyright (c) 2015 Fujitsu. All Rights Reserved.
>>> +#
>>> +# This program is free software; you can redistribute it and/or
>>> +# modify it under the terms of the GNU General Public License as
>>> +# published by the Free Software Foundation.
>>> +#
>>> +# This program is distributed in the hope that it would be useful,
>>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> +# GNU General Public License for more details.
>>> +#
>>> +# You should have received a copy of the GNU General Public License
>>> +# along with this program; if not, write the Free Software Foundation,
>>> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
>>> +#-----------------------------------------------------------------------
>>>
>>> +#
>>> +
>>> +seq=`basename $0`
>>> +seqres=$RESULT_DIR/$seq
>>> +echo "QA output created by $seq"
>>> +
>>> +here=`pwd`
>>> +tmp=/tmp/$$
>>> +status=1       # failure is the default!
>>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>>> +
>>> +_cleanup()
>>> +{
>>> +    rm -f $tmp.*
>>> +}
>>> +
>>> +# get standard environment, filters and checks
>>> +. ./common/rc
>>> +. ./common/filter
>>> +
>>> +# real QA test starts here
>>> +
>>> +# Modify as appropriate.
>>> +_need_to_be_root
>>> +_supported_fs btrfs
>>> +_supported_os Linux
>>> +_require_scratch
>>> +_require_cp_reflink
>>> +
>>> +run_check _scratch_mkfs "--nodesize 4096"
>>
>> --nodesize 65536
>> Otherwise the test fails (unnecessarily) on platforms with a page size
>> > 4Kb.
>>
>
> Leave this bit, we're going to merge the sub-page blocksize stuff soon
> anyway, and it makes the numbers add up easier for qgroup stuff.
Agreed with Josef: a 4K leaf/node size, rather than 64K, makes the qgroup
results much easier to compare.

In a sense, a fs made on one arch that can't be mounted on another arch
is already a kind of bug, so the failure is not meaningless.
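
As a rough illustration of why 4K nodes keep the numbers simple, the values in 083.out can be derived by hand. This is only a sketch, and it assumes (the thread does not state this explicitly) that each subvolume's tree occupies exactly one leaf, which is the only space exclusive to that subvolume once the file data is shared:

```shell
#!/bin/bash
# Sketch: derive the "rfer excl" pair expected for each subvolume.
# Assumption: one tree leaf of nodesize bytes per subvolume, counted as
# exclusive; the cloned file data is shared by all three subvolumes.
nodesize=4096   # from "--nodesize 4096"
bs=4096         # dd bs=4K
count=64        # dd count=64

data_bytes=$((bs * count))        # file data: 262144 bytes, shared
rfer=$((data_bytes + nodesize))   # referenced: shared data + own leaf
excl=$nodesize                    # exclusive: own leaf only

echo "$rfer $excl"
```

Under that assumption the script prints "266240 4096", which matches the per-subvolume lines in the golden output.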
>
>>> +
>>> +# inode cache will also take space in fs tree, disable them to get
>>> consistent
>>> +# result.
>>> +run_check _scratch_mount "-o noinode_cache"
>>
>> -o noinode_cache, unlike -o inode_cache, is still a relatively new
>> mount option (early 2014). Won't the test fail on older kernels that
>> don't recognize this mount option?
Yes, the noinode_cache mount option is new, but it is already a bug if we
can enable the inode cache but can't disable it.

An old kernel may not support it and so can't pass the testcase, but
that indicates a bug, so I think it's OK.

Thanks,
Qu
>
> Yeah but my qgroup patch wasnt in until late last year anyway, we're not
> super worried about older kernels failing tests, we already know they're
> broken.  The other comments are fine.

Thanks,
>
> Josef
Qu Wenruo March 13, 2015, 1:08 a.m. UTC | #4
-------- Original Message  --------
Subject: Re: [PATCH] fstest: btrfs/083: Test for incorrect exclusive
reference number after file clone.
From: Filipe David Manana <fdmanana@gmail.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2015-03-12 20:38

> On Tue, Mar 10, 2015 at 7:46 AM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
>> [Problem]
>> Since commit fcebe4562dec83b3f8d308 ("Btrfs: rework qgroup accounting"),
>> quota data update is delayed after delayed_ref calculation, and lacks
>> correct protection to detect root reference which shouldn't be counted
>> in current sequence number but already written into extent backref.
>>
>> This makes exclusive reference not decreased correctly and give incorrect
>> result.
>>
>> [Test procedure]
>> 1. Create a btrfs with 3 subvolumes, quota enabled and rescanned.
>> 2. Create a file in 1st subvolume
>> 3. Clone the file to 2nd and 3rd subvolume
>> 4. Sync the fs to reflect the changes in qgroup.
>> 5. Check the qgroup data
>>
>> [Expected result]
>> None of the subvolume has exclusive reference to the file.
>>
>> [Actual result]
>> The first subvolume still have exclusive reference to the file.
>>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> ---
>>   tests/btrfs/083     | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   tests/btrfs/083.out |  5 ++++
>>   tests/btrfs/group   |  1 +
>>   3 files changed, 82 insertions(+)
>>   create mode 100755 tests/btrfs/083
>>   create mode 100644 tests/btrfs/083.out
>>
>> diff --git a/tests/btrfs/083 b/tests/btrfs/083
>> new file mode 100755
>> index 0000000..17fd30b
>> --- /dev/null
>> +++ b/tests/btrfs/083
>> @@ -0,0 +1,76 @@
>> +#! /bin/bash
>> +# FS QA Test No. 083
>> +#
>> +# Test for incorrect exclusive reference count after cloning file
>> +# between subvolumes.
>> +#
>> +#-----------------------------------------------------------------------
>> +# Copyright (c) 2015 Fujitsu. All Rights Reserved.
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU General Public License as
>> +# published by the Free Software Foundation.
>> +#
>> +# This program is distributed in the hope that it would be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program; if not, write the Free Software Foundation,
>> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
>> +#-----------------------------------------------------------------------
>> +#
>> +
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +tmp=/tmp/$$
>> +status=1       # failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +_cleanup()
>> +{
>> +    rm -f $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +
>> +# real QA test starts here
>> +
>> +# Modify as appropriate.
>> +_need_to_be_root
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +_require_cp_reflink
>> +
>> +run_check _scratch_mkfs "--nodesize 4096"
>
> --nodesize 65536
> Otherwise the test fails (unnecessarily) on platforms with a page size > 4Kb.
>
>> +
>> +# inode cache will also take space in fs tree, disable them to get consistent
>> +# result.
>> +run_check _scratch_mount "-o noinode_cache"
>
> -o noinode_cache, unlike -o inode_cache, is still a relatively new
> mount option (early 2014). Won't the test fail on older kernels that
> don't recognize this mount option?
>
>> +
>> +_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv1
>> +_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv2
>> +_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv3
>> +
>> +_run_btrfs_util_prog quota enable $SCRATCH_MNT
>> +_run_btrfs_util_prog quota rescan -w $SCRATCH_MNT
>> +
>> +dd if=/dev/zero of=$SCRATCH_MNT/subv1/file1 bs=4K count=64 &> /dev/null
>
> Why ignore dd failures? Normally we want the output (specially errors)
> to be matched against the golden output.
Indeed, I will redirect the error output into $seq.full in the next version.
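
A minimal sketch of what that redirection might look like. The variables below are stand-ins so the snippet runs outside fstests; in the real test the log would presumably be $seqres.full and the target $SCRATCH_MNT/subv1/file1, and the next version of the patch may well do it differently:

```shell
#!/bin/bash
# Sketch only: workdir, seqres and target are hypothetical stand-ins for
# the fstests environment ($RESULT_DIR/$seq and the scratch mount).
workdir=$(mktemp -d)
seqres=$workdir/083
target=$workdir/file1

# Append dd's stdout and stderr to the .full log instead of /dev/null,
# so that a failure leaves a trace that can be inspected afterwards.
dd if=/dev/zero of="$target" bs=4K count=64 >> "$seqres.full" 2>&1
dd_status=$?
```

Note that this keeps dd's messages out of the golden output; it records them for debugging rather than matching them against 083.out.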
>
>> +cp --reflink $SCRATCH_MNT/subv1/file1 $SCRATCH_MNT/subv2/file1
>> +cp --reflink $SCRATCH_MNT/subv1/file1 $SCRATCH_MNT/subv3/file1
>> +_run_btrfs_util_prog filesystem sync $SCRATCH_MNT
>
> Plain old 'sync' should work here as well.

Yes, plain sync will do it, but I prefer not to touch other unrelated
filesystems, so I chose 'btrfs fi sync', which only involves the scratch mount.

The other comments are answered inline in my reply to Josef.

Thanks,
Qu
>
> thanks Qu
>
>> +
>> +units=`_btrfs_qgroup_units`
>> +$BTRFS_UTIL_PROG qgroup show $units $SCRATCH_MNT | $SED_PROG -n '/[0-9]/p' \
>> +       | $AWK_PROG '{print $2" "$3}'
>> +
>> +# success, all done
>> +status=0
>> +exit
>> diff --git a/tests/btrfs/083.out b/tests/btrfs/083.out
>> new file mode 100644
>> index 0000000..359b4a0
>> --- /dev/null
>> +++ b/tests/btrfs/083.out
>> @@ -0,0 +1,5 @@
>> +QA output created by 083
>> +4096 4096
>> +266240 4096
>> +266240 4096
>> +266240 4096
>> diff --git a/tests/btrfs/group b/tests/btrfs/group
>> index fd2fa76..04d5d67 100644
>> --- a/tests/btrfs/group
>> +++ b/tests/btrfs/group
>> @@ -85,3 +85,4 @@
>>   080 auto snapshot
>>   081 auto quick clone
>>   082 auto quick remount
>> +083 auto quick qgroup
>> --
>> 2.3.1
>>
Qu Wenruo March 13, 2015, 1:58 a.m. UTC | #5
-------- Original Message  --------
Subject: Re: [PATCH] fstest: btrfs/083: Test for incorrect exclusive
reference number after file clone.
From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Josef Bacik <jbacik@fb.com>, fdmanana@gmail.com
Date: 2015-03-13 09:03

>
>
> -------- Original Message  --------
> Subject: Re: [PATCH] fstest: btrfs/083: Test for incorrect exclusive
> reference number after file clone.
> From: Josef Bacik <jbacik@fb.com>
> To: <fdmanana@gmail.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>
> Date: 2015-03-12 20:49
>
>> On 03/12/2015 08:38 AM, Filipe David Manana wrote:
>>> On Tue, Mar 10, 2015 at 7:46 AM, Qu Wenruo <quwenruo@cn.fujitsu.com>
>>> wrote:
>>>> [Problem]
>>>> Since commit fcebe4562dec83b3f8d308 ("Btrfs: rework qgroup
>>>> accounting"),
>>>> quota data update is delayed after delayed_ref calculation, and lacks
>>>> correct protection to detect root reference which shouldn't be counted
>>>> in current sequence number but already written into extent backref.
>>>>
>>>> This makes exclusive reference not decreased correctly and give
>>>> incorrect
>>>> result.
>>>>
>>>> [Test procedure]
>>>> 1. Create a btrfs with 3 subvolumes, quota enabled and rescanned.
>>>> 2. Create a file in 1st subvolume
>>>> 3. Clone the file to 2nd and 3rd subvolume
>>>> 4. Sync the fs to reflect the changes in qgroup.
>>>> 5. Check the qgroup data
>>>>
>>>> [Expected result]
>>>> None of the subvolume has exclusive reference to the file.
>>>>
>>>> [Actual result]
>>>> The first subvolume still have exclusive reference to the file.
[snip]
> Thanks,
>>
>> Josef
BTW, I'm somewhat worried about how to fix the problem.

Two methods come to mind, but neither seems perfect.
1) Change the timing of the qgroup counter update.
Just move the actual qgroup update from the current
btrfs_delayed_qgroup_accounting() to btrfs_qgroup_record_ref().

Although it seems this would give us accurate old/new_roots, delayed tree
block refs can be ordered/merged, so a data ref may be run before its
tree block, causing btrfs_find_all_roots() to always return 0 roots.

So this method needs extra modification of delayed refs.

2) Do "oper-replay" in the qgroup code.
We can do something like log replay for qgroup operations, and do some black
magic to calculate the correct old/new_roots numbers even though we can only
see the final backrefs.

But I haven't found a good algorithm to do this, and the complexity may
cause even more bugs.

Any ideas?

Thanks,
Qu
Patch

diff --git a/tests/btrfs/083 b/tests/btrfs/083
new file mode 100755
index 0000000..17fd30b
--- /dev/null
+++ b/tests/btrfs/083
@@ -0,0 +1,76 @@ 
+#! /bin/bash
+# FS QA Test No. 083
+#
+# Test for incorrect exclusive reference count after cloning file
+# between subvolumes.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2015 Fujitsu. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+    rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+
+# Modify as appropriate.
+_need_to_be_root
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_cp_reflink
+
+run_check _scratch_mkfs "--nodesize 4096"
+
+# inode cache will also take space in fs tree, disable them to get consistent
+# result.
+run_check _scratch_mount "-o noinode_cache"
+
+_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv1
+_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv2
+_run_btrfs_util_prog subvolume create $SCRATCH_MNT/subv3
+
+_run_btrfs_util_prog quota enable $SCRATCH_MNT
+_run_btrfs_util_prog quota rescan -w $SCRATCH_MNT
+
+dd if=/dev/zero of=$SCRATCH_MNT/subv1/file1 bs=4K count=64 &> /dev/null
+cp --reflink $SCRATCH_MNT/subv1/file1 $SCRATCH_MNT/subv2/file1
+cp --reflink $SCRATCH_MNT/subv1/file1 $SCRATCH_MNT/subv3/file1
+_run_btrfs_util_prog filesystem sync $SCRATCH_MNT
+
+units=`_btrfs_qgroup_units`
+$BTRFS_UTIL_PROG qgroup show $units $SCRATCH_MNT | $SED_PROG -n '/[0-9]/p' \
+	| $AWK_PROG '{print $2" "$3}'
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/083.out b/tests/btrfs/083.out
new file mode 100644
index 0000000..359b4a0
--- /dev/null
+++ b/tests/btrfs/083.out
@@ -0,0 +1,5 @@ 
+QA output created by 083
+4096 4096
+266240 4096
+266240 4096
+266240 4096
diff --git a/tests/btrfs/group b/tests/btrfs/group
index fd2fa76..04d5d67 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -85,3 +85,4 @@ 
 080 auto snapshot
 081 auto quick clone
 082 auto quick remount
+083 auto quick qgroup
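
For readers unfamiliar with the output filter at the end of tests/btrfs/083, the sketch below runs the same sed/awk pipeline on made-up input. The sample text is hypothetical: the qgroup IDs and header layout are illustrative and were not captured from a real `btrfs qgroup show` run. Plain `sed` and `awk` stand in for $SED_PROG and $AWK_PROG.

```shell
#!/bin/bash
# Hypothetical sample of qgroup output in raw bytes (illustrative only).
sample='qgroupid rfer excl
-------- ---- ----
0/5 4096 4096
0/256 266240 4096
0/257 266240 4096
0/258 266240 4096'

# Same filter as the test: keep only lines containing a digit (this drops
# the header and the separator line), then print columns 2 and 3, i.e.
# the referenced and exclusive byte counts, without the qgroup IDs.
filtered=$(printf '%s\n' "$sample" | sed -n '/[0-9]/p' | awk '{print $2" "$3}')
echo "$filtered"
```

Dropping the qgroup ID column keeps the golden output independent of which subvolume IDs the kernel happens to assign.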