[1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space

Message ID 20190913015127.14953-1-wqu@suse.com
State New

Commit Message

Qu Wenruo Sept. 13, 2019, 1:51 a.m. UTC
[BUG]
In the following case with qgroup enabled, if an error happens after
we have reserved delalloc space, the error handling path can leak
qgroup data space:

From btrfs_truncate_block() in inode.c:

	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
					   block_start, blocksize);
	if (ret)
		goto out;

again:
	page = find_or_create_page(mapping, index, mask);
	if (!page) {
		btrfs_delalloc_release_space(inode, data_reserved,
					     block_start, blocksize, true);
		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
		ret = -ENOMEM;
		goto out;
	}

[CAUSE]
In the above case, btrfs_delalloc_reserve_space() calls
btrfs_qgroup_reserve_data() and marks the io_tree range with the
EXTENT_QGROUP_RESERVED flag.

In the error handling path, btrfs_delalloc_release_space() calls
btrfs_qgroup_free_data(), which should clear the EXTENT_QGROUP_RESERVED
flag and reduce the reserved data space according to the cleared range.

However, due to a completion bug, btrfs_qgroup_free_data() clears the
EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree rather
than in the correct BTRFS_I(inode)->io_tree.
Since io_failure_tree is never marked with that flag,
btrfs_qgroup_free_data() does not free any reserved data space at all,
causing a leak.
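
A minimal user-space sketch of the mechanism described above, not kernel
code. It assumes a toy model in which each tree is a 64-bit bitmap and the
reserved counter is only reduced by the number of bits that were actually
cleared. All names (model_tree, model_inode, model_reserve, model_free) are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BLOCKS 64	/* hypothetical: the model covers 64 blocks */

/* Hypothetical stand-in for an extent io tree: one flag bit per block. */
struct model_tree {
	uint64_t qgroup_reserved_bits;
};

/* Hypothetical stand-in for the per-inode state. */
struct model_inode {
	struct model_tree io_tree;		/* reservations are marked here */
	struct model_tree io_failure_tree;	/* never carries the flag */
	uint64_t qgroup_rsv_blocks;		/* blocks charged to the qgroup */
};

static uint64_t range_mask(unsigned int start, unsigned int len)
{
	assert(start < MODEL_BLOCKS && len <= MODEL_BLOCKS - start);
	uint64_t m = (len == MODEL_BLOCKS) ? ~0ULL : ((1ULL << len) - 1);
	return m << start;
}

static unsigned int count_bits(uint64_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Clear the flag in @tree and return how many blocks really changed.
 * This plays the role of the changeset that the real clear operation
 * fills in.
 */
static unsigned int model_clear_bits(struct model_tree *tree,
				     unsigned int start, unsigned int len)
{
	uint64_t changed = tree->qgroup_reserved_bits & range_mask(start, len);

	tree->qgroup_reserved_bits &= ~changed;
	return count_bits(changed);
}

/* Mark the range in io_tree and charge only the newly marked blocks. */
static void model_reserve(struct model_inode *inode,
			  unsigned int start, unsigned int len)
{
	uint64_t newly = range_mask(start, len) &
			 ~inode->io_tree.qgroup_reserved_bits;

	inode->io_tree.qgroup_reserved_bits |= newly;
	inode->qgroup_rsv_blocks += count_bits(newly);
}

/* Free the range; @use_wrong_tree selects the pre-fix behaviour. */
static void model_free(struct model_inode *inode, unsigned int start,
		       unsigned int len, bool use_wrong_tree)
{
	struct model_tree *tree = use_wrong_tree ? &inode->io_failure_tree
						 : &inode->io_tree;

	inode->qgroup_rsv_blocks -= model_clear_bits(tree, start, len);
}
```

In this model, reserving a range and then calling model_free() with
use_wrong_tree set leaves qgroup_rsv_blocks unchanged, because nothing was
ever flagged in the failure tree. With the flag cleared, the counter drops
back to zero.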

All such error handling paths can only be triggered by errors that do
not come from qgroup, so a regular EDQUOT error won't trigger the bug.
Normally we need error injection to trigger such a bug.
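
To illustrate why error injection is needed, here is a hedged user-space
sketch of the btrfs_truncate_block() error path quoted above. It is a toy
model, not the kernel's fault injection framework: the struct, its fields
and the model_* functions are all hypothetical, and the "page allocation"
is just a flag that forces the -ENOMEM branch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state; none of these names exist in the kernel. */
struct model_state {
	uint64_t flagged_in_io_tree;	/* bytes flagged reserved in io_tree */
	uint64_t qgroup_reserved;	/* bytes charged to the qgroup */
	bool fail_page_alloc;		/* injected fault: page lookup fails */
	bool release_uses_io_tree;	/* false = pre-fix, true = post-fix */
};

/* Models the reservation: flag the range and charge the qgroup. */
static int model_reserve_space(struct model_state *s, uint64_t len)
{
	s->flagged_in_io_tree += len;
	s->qgroup_reserved += len;
	return 0;
}

/* Models the release: only bytes whose flag was cleared are uncharged. */
static void model_release_space(struct model_state *s, uint64_t len)
{
	uint64_t cleared = 0;

	if (s->release_uses_io_tree) {
		cleared = len < s->flagged_in_io_tree ? len
						      : s->flagged_in_io_tree;
		s->flagged_in_io_tree -= cleared;
	}
	/* Pre-fix: the wrong tree has nothing flagged, so cleared stays 0. */
	s->qgroup_reserved -= cleared;
}

/* Models the quoted error path: reserve, fail the page lookup, release. */
static int model_truncate_block(struct model_state *s, uint64_t blocksize)
{
	int ret = model_reserve_space(s, blocksize);

	if (ret)
		return ret;
	if (s->fail_page_alloc) {
		model_release_space(s, blocksize);
		return -ENOMEM;
	}
	/* Success: the reservation stays in place for later writeback. */
	return 0;
}
```

With fail_page_alloc set and release_uses_io_tree clear, the call returns
-ENOMEM and qgroup_reserved still holds the full blocksize, which is the
leak. The failure originates outside the reservation step, so a quota
rejection at reserve time never reaches this branch.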

[FIX]
Fix the wrong target io_tree.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/qgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Nikolay Borisov Sept. 13, 2019, 12:57 p.m. UTC | #1
On 13.09.19 at 4:51, Qu Wenruo wrote:
> [BUG]
> Under the follow case with qgroup enabled, if some error happened after
> we have reserved delalloc space, then in error handling path, we could
> cause qgroup data space leakage:
> 
> From btrfs_truncate_block() in inode.c:
> 
> 	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
> 					   block_start, blocksize);
> 	if (ret)
> 		goto out;
> 
> again:
> 	page = find_or_create_page(mapping, index, mask);
> 	if (!page) {
> 		btrfs_delalloc_release_space(inode, data_reserved,
> 					     block_start, blocksize, true);
> 		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
> 		ret = -ENOMEM;
> 		goto out;
> 	}
> 
> [CAUSE]
> In above case, btrfs_delalloc_reserve_space() will call
> btrfs_qgroup_reserve_data() and mark the io_tree range with
> EXTENT_QGROUP_RESERVED flag.
> 
> In the error handling path, btrfs_delalloc_release_space() calls
> btrfs_qgroup_free_data() which should clear EXTENT_QGROUP_RESERVED flag
> and reduce the reserved data space according to the cleared range.
> 
> However due to a completion bug, btrfs_qgroup_free_data() will clear
> EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
> than the correct BTRFS_I(inode)->io_tree.

This is a bit confusing because the error is actually in
qgroup_free_reserved_data, which is called from
__btrfs_qgroup_release_data. But in the latter function there is also a
call to clear_record_extent_bits with the correct tree. Just fix the
function name by using qgroup_free_reserved_data.

> Since io_failure_tree is never marked with that flag,
> btrfs_qgroup_free_data() will not free any data reserved space at all,
> causing a leakage.
> 
> All of such error handling cases can only be triggered some errors not

I take it you meant:

This error handling can only be triggered by errors outside of qgroup,
e.g. EDQUOT can't trigger the bug?

The first part of the sentence is hard to parse.

> from qgroup, so regular EDQUOT error won't trigger the bug.
> Normally we need error injection to trigger such bug.
Qu Wenruo Sept. 13, 2019, 1:02 p.m. UTC | #2
On 2019/9/13 8:57 PM, Nikolay Borisov wrote:
>
>
> On 13.09.19 at 4:51, Qu Wenruo wrote:
>> [BUG]
>> Under the follow case with qgroup enabled, if some error happened after
>> we have reserved delalloc space, then in error handling path, we could
>> cause qgroup data space leakage:
>>
>> From btrfs_truncate_block() in inode.c:
>>
>> 	ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
>> 					   block_start, blocksize);
>> 	if (ret)
>> 		goto out;
>>
>> again:
>> 	page = find_or_create_page(mapping, index, mask);
>> 	if (!page) {
>> 		btrfs_delalloc_release_space(inode, data_reserved,
>> 					     block_start, blocksize, true);
>> 		btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
>> 		ret = -ENOMEM;
>> 		goto out;
>> 	}
>>
>> [CAUSE]
>> In above case, btrfs_delalloc_reserve_space() will call
>> btrfs_qgroup_reserve_data() and mark the io_tree range with
>> EXTENT_QGROUP_RESERVED flag.
>>
>> In the error handling path, btrfs_delalloc_release_space() calls
>> btrfs_qgroup_free_data() which should clear EXTENT_QGROUP_RESERVED flag
>> and reduce the reserved data space according to the cleared range.
>>
>> However due to a completion bug, btrfs_qgroup_free_data() will clear
>> EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree, other
>> than the correct BTRFS_I(inode)->io_tree.
>
> This is a bit confusing because the error is actually in
> qgroup_free_reserved_data, which is called from
> __btrfs_qgroup_release_data. But in the latter function there is also a
> call to clear_record_extent_bits with the correct tree. Just fix the
> function name by using qgroup_free_reserved_data.

Right, I skipped some callers here, as the call chain depends not only
on btrfs_qgroup_free_data() but also on its parameters.
E.g. only when @reserved is non-NULL do we go into
qgroup_free_reserved_data().
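
A rough sketch of that dispatch, as a user-space model rather than the
kernel function. The non-NULL check on the reserved changeset comes from
the reply above. The additional do_free condition is an assumption based
on a reading of the kernel source of that era and is not stated in this
thread. All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which branch the release helper takes in this model. */
enum model_path {
	MODEL_PATH_FREE_RESERVED,	/* models qgroup_free_reserved_data() */
	MODEL_PATH_DIRECT_CLEAR,	/* models the direct clear on io_tree */
};

/* Hypothetical placeholder for the reserved-range changeset. */
struct model_changeset {
	int unused;
};

/*
 * Models the release helper. Only a "free" request (assumed flag) that
 * also carries a non-NULL changeset takes the free-reserved branch.
 */
static enum model_path model_release_data(const struct model_changeset *reserved,
					  bool do_free)
{
	if (do_free && reserved)
		return MODEL_PATH_FREE_RESERVED;
	return MODEL_PATH_DIRECT_CLEAR;
}

/* Before the fix, only the free-reserved branch used the wrong tree. */
static bool model_hits_wrong_tree(const struct model_changeset *reserved,
				  bool do_free)
{
	return model_release_data(reserved, do_free) == MODEL_PATH_FREE_RESERVED;
}
```

Under these assumptions, a caller that passes a NULL changeset goes
through the direct clear on the correct tree and is not affected, which
matches the point that the bug depends on the parameters as well as on
the entry function.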

>
>> Since io_failure_tree is never marked with that flag,
>> btrfs_qgroup_free_data() will not free any data reserved space at all,
>> causing a leakage.
>>
>> All of such error handling cases can only be triggered some errors not
>
> I take it you meant:
>
> This error handling can only be triggered by errors outside of qgroup
> e.g. EDQUOT can't trigger the bug?

Right.

I'll change it to something like "such leakage can only be triggered by
errors outside of qgroup."

Thanks,
Qu


Patch

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 2891b57b9e1e..64bdc3e3652d 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3492,7 +3492,7 @@  static int qgroup_free_reserved_data(struct inode *inode,
 		 * EXTENT_QGROUP_RESERVED, we won't double free.
 		 * So not need to rush.
 		 */
-		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
+		ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
 				free_start, free_start + free_len - 1,
 				EXTENT_QGROUP_RESERVED, &changeset);
 		if (ret < 0)