diff mbox series

btrfs: do not ASSERT() on duplicated global roots

Message ID 89dec0414ce198ef49d3014232bd1b88e3e74260.1686441969.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: do not ASSERT() on duplicated global roots | expand

Commit Message

Qu Wenruo June 11, 2023, 12:09 a.m. UTC
[BUG]
Syzbot reports a reproducible ASSERT() when using rescue=usebackuproot
mount option on a corrupted fs.

The full report can be found here:
https://syzkaller.appspot.com/bug?extid=c4614eae20a166c25bf0

[CAUSE]
Since the introduction of global roots, we handle
csum/extent/free-space-tree roots as global roots, even if no
extent-tree-v2 feature is enabled.

So for regular csum/extent/fst roots, we load them into
fs_info::global_root_tree rb tree.

And we should not expect any conflicts in that rb tree, thus we have an
ASSERT() inside btrfs_global_root_insert().

But rescue=usebackuproot can break the assumption, as we will try to
load those trees again and again as long as we have bad roots and have
backup roots slot remaining.

So in that case we can have conflicting roots in the rb tree, and
triggering the ASSERT() crash.

[FIX]
We can safely remove that ASSERT(), as the caller will properly put the
offending root.

To make further debugging easier, also add two explicit error messages:

- Error message for conflicting global roots
- Error message when using backup roots slot

Reported-by: syzbot+a694851c6ab28cbcfb9c@syzkaller.appspotmail.com
Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/disk-io.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

David Sterba June 12, 2023, 9:37 p.m. UTC | #1
On Sun, Jun 11, 2023 at 08:09:13AM +0800, Qu Wenruo wrote:
> [BUG]
> Syzbot reports a reproducible ASSERT() when using rescue=usebackuproot
> mount option on a corrupted fs.
> 
> The full report can be found here:
> https://syzkaller.appspot.com/bug?extid=c4614eae20a166c25bf0

I'll copy the stack trace here, the link is good for reference but we
want to record the contents too.

> [CAUSE]
> Since the introduction of global roots, we handle
> csum/extent/free-space-tree roots as global roots, even if no
> extent-tree-v2 feature is enabled.
> 
> So for regular csum/extent/fst roots, we load them into
> fs_info::global_root_tree rb tree.
> 
> And we should not expect any conflicts in that rb tree, thus we have an
> ASSERT() inside btrfs_global_root_insert().
> 
> But rescue=usebackuproot can break the assumption, as we will try to
> load those trees again and again as long as we have bad roots and have
> backup roots slot remaining.
> 
> So in that case we can have conflicting roots in the rb tree, and
> triggering the ASSERT() crash.
> 
> [FIX]
> We can safely remove that ASSERT(), as the caller will properly put the
> offending root.
> 
> To make further debugging easier, also add two explicit error messages:
> 
> - Error message for conflicting global roots
> - Error message when using backup roots slot
> 
> Reported-by: syzbot+a694851c6ab28cbcfb9c@syzkaller.appspotmail.com
> Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/disk-io.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 1168e5df8bae..7f201975a303 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -748,13 +748,18 @@ int btrfs_global_root_insert(struct btrfs_root *root)
>  {
>  	struct btrfs_fs_info *fs_info = root->fs_info;
>  	struct rb_node *tmp;
> +	int ret = 0;
>  
>  	write_lock(&fs_info->global_root_lock);
>  	tmp = rb_find_add(&root->rb_node, &fs_info->global_root_tree, global_root_cmp);
>  	write_unlock(&fs_info->global_root_lock);
> -	ASSERT(!tmp);
>  
> -	return tmp ? -EEXIST : 0;
> +	if (tmp) {
> +		ret = -EEXIST;
> +		btrfs_warn(fs_info, "global root %lld %llu already exists",

%lld is to show the negative numbers, right? Are there any actual global
roots with such numbers?

BTRFS_DATA_RELOC_TREE_OBJECTID		-9
BTRFS_TREE_RELOC_OBJECTID		-8
BTRFS_TREE_LOG_FIXUP_OBJECTID		-7
BTRFS_TREE_LOG_OBJECTID			-6

None of them seems to be among the global roots.

> +				root->root_key.objectid, root->root_key.offset);
> +	}
> +	return ret;
>  }
>  
>  void btrfs_global_root_delete(struct btrfs_root *root)
> @@ -2591,6 +2596,7 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info)
>  			/* We can't trust the free space cache either */
>  			btrfs_set_opt(fs_info->mount_opt, CLEAR_CACHE);
>  
> +			btrfs_warn(fs_info, "try to load backup roots slot %u", i);

is int, so %d

>  			ret = read_backup_root(fs_info, i);
>  			backup_index = ret;
>  			if (ret < 0)
> -- 
> 2.41.0
Qu Wenruo June 12, 2023, 11:10 p.m. UTC | #2
On 2023/6/13 05:37, David Sterba wrote:
> On Sun, Jun 11, 2023 at 08:09:13AM +0800, Qu Wenruo wrote:
>> [BUG]
>> Syzbot reports a reproducible ASSERT() when using rescue=usebackuproot
>> mount option on a corrupted fs.
>>
>> The full report can be found here:
>> https://syzkaller.appspot.com/bug?extid=c4614eae20a166c25bf0
>
> I'll copy the stack trace here, the link is good for reference but we
> want to record the contents too.
>
>> [CAUSE]
>> Since the introduction of global roots, we handle
>> csum/extent/free-space-tree roots as global roots, even if no
>> extent-tree-v2 feature is enabled.
>>
>> So for regular csum/extent/fst roots, we load them into
>> fs_info::global_root_tree rb tree.
>>
>> And we should not expect any conflicts in that rb tree, thus we have an
>> ASSERT() inside btrfs_global_root_insert().
>>
>> But rescue=usebackuproot can break the assumption, as we will try to
>> load those trees again and again as long as we have bad roots and have
>> backup roots slot remaining.
>>
>> So in that case we can have conflicting roots in the rb tree, and
>> triggering the ASSERT() crash.
>>
>> [FIX]
>> We can safely remove that ASSERT(), as the caller will properly put the
>> offending root.
>>
>> To make further debugging easier, also add two explicit error messages:
>>
>> - Error message for conflicting global roots
>> - Error message when using backup roots slot
>>
>> Reported-by: syzbot+a694851c6ab28cbcfb9c@syzkaller.appspotmail.com
>> Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/disk-io.c | 10 ++++++++--
>>   1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 1168e5df8bae..7f201975a303 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -748,13 +748,18 @@ int btrfs_global_root_insert(struct btrfs_root *root)
>>   {
>>   	struct btrfs_fs_info *fs_info = root->fs_info;
>>   	struct rb_node *tmp;
>> +	int ret = 0;
>>
>>   	write_lock(&fs_info->global_root_lock);
>>   	tmp = rb_find_add(&root->rb_node, &fs_info->global_root_tree, global_root_cmp);
>>   	write_unlock(&fs_info->global_root_lock);
>> -	ASSERT(!tmp);
>>
>> -	return tmp ? -EEXIST : 0;
>> +	if (tmp) {
>> +		ret = -EEXIST;
>> +		btrfs_warn(fs_info, "global root %lld %llu already exists",
>
> %lld is to show the negative numbers, right? Are there any actual global
> roots with such numbers?
>
> BTRFS_DATA_RELOC_TREE_OBJECTID		-9
> BTRFS_TREE_RELOC_OBJECTID		-8
> BTRFS_TREE_LOG_FIXUP_OBJECTID		-7
> BTRFS_TREE_LOG_OBJECTID			-6
>
> None of them seems to be among the global roots.

Right, data reloc tree is not a global one, so %llu is OK.
>
>> +				root->root_key.objectid, root->root_key.offset);
>> +	}
>> +	return ret;
>>   }
>>
>>   void btrfs_global_root_delete(struct btrfs_root *root)
>> @@ -2591,6 +2596,7 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info)
>>   			/* We can't trust the free space cache either */
>>   			btrfs_set_opt(fs_info->mount_opt, CLEAR_CACHE);
>>
>> +			btrfs_warn(fs_info, "try to load backup roots slot %u", i);
>
> is int, so %d

But @i is only between [0, 4), thus %u is fine.

Thanks,
Qu

>
>>   			ret = read_backup_root(fs_info, i);
>>   			backup_index = ret;
>>   			if (ret < 0)
>> --
>> 2.41.0
diff mbox series

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 1168e5df8bae..7f201975a303 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -748,13 +748,18 @@  int btrfs_global_root_insert(struct btrfs_root *root)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct rb_node *tmp;
+	int ret = 0;
 
 	write_lock(&fs_info->global_root_lock);
 	tmp = rb_find_add(&root->rb_node, &fs_info->global_root_tree, global_root_cmp);
 	write_unlock(&fs_info->global_root_lock);
-	ASSERT(!tmp);
 
-	return tmp ? -EEXIST : 0;
+	if (tmp) {
+		ret = -EEXIST;
+		btrfs_warn(fs_info, "global root %lld %llu already exists",
+				root->root_key.objectid, root->root_key.offset);
+	}
+	return ret;
 }
 
 void btrfs_global_root_delete(struct btrfs_root *root)
@@ -2591,6 +2596,7 @@  static int __cold init_tree_roots(struct btrfs_fs_info *fs_info)
 			/* We can't trust the free space cache either */
 			btrfs_set_opt(fs_info->mount_opt, CLEAR_CACHE);
 
+			btrfs_warn(fs_info, "try to load backup roots slot %u", i);
 			ret = read_backup_root(fs_info, i);
 			backup_index = ret;
 			if (ret < 0)