Message ID | 89dec0414ce198ef49d3014232bd1b88e3e74260.1686441969.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: do not ASSERT() on duplicated global roots | expand |
On Sun, Jun 11, 2023 at 08:09:13AM +0800, Qu Wenruo wrote: > [BUG] > Syzbot reports a reproducible ASSERT() when using rescue=usebackuproot > mount option on a corrupted fs. > > The full report can be found here: > https://syzkaller.appspot.com/bug?extid=c4614eae20a166c25bf0 I'll copy the stack trace here, the link is good for reference but we want to record the contents too. > [CAUSE] > Since the introduction of global roots, we handle > csum/extent/free-space-tree roots as global roots, even if no > extent-tree-v2 feature is enabled. > > So for regular csum/extent/fst roots, we load them into > fs_info::global_root_tree rb tree. > > And we should not expect any conflicts in that rb tree, thus we have an > ASSERT() inside btrfs_global_root_insert(). > > But rescue=usebackuproot can break the assumption, as we will try to > load those trees again and again as long as we have bad roots and have > backup roots slot remaining. > > So in that case we can have conflicting roots in the rb tree, and > triggering the ASSERT() crash. > > [FIX] > We can safely remove that ASSERT(), as the caller will properly put the > offending root. > > To make further debugging easier, also add two explicit error messages: > > - Error message for conflicting global roots > - Error message when using backup roots slot > > Reported-by: syzbot+a694851c6ab28cbcfb9c@syzkaller.appspotmail.com > Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree") > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/disk-io.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c > index 1168e5df8bae..7f201975a303 100644 > --- a/fs/btrfs/disk-io.c > +++ b/fs/btrfs/disk-io.c > @@ -748,13 +748,18 @@ int btrfs_global_root_insert(struct btrfs_root *root) > { > struct btrfs_fs_info *fs_info = root->fs_info; > struct rb_node *tmp; > + int ret = 0; > > write_lock(&fs_info->global_root_lock); > tmp = rb_find_add(&root->rb_node, &fs_info->global_root_tree, global_root_cmp); > write_unlock(&fs_info->global_root_lock); > - ASSERT(!tmp); > > - return tmp ? -EEXIST : 0; > + if (tmp) { > + ret = -EEXIST; > + btrfs_warn(fs_info, "global root %lld %llu already exists", %lld is to show the negative numbers, right? Are there any actual global roots with such numbers? BTRFS_DATA_RELOC_TREE_OBJECTID -9 BTRFS_TREE_RELOC_OBJECTID -8 BTRFS_TREE_LOG_FIXUP_OBJECTID -7 BTRFS_TREE_LOG_OBJECTID -6 None of them seems to be among the global roots. > + root->root_key.objectid, root->root_key.offset); > + } > + return ret; > } > > void btrfs_global_root_delete(struct btrfs_root *root) > @@ -2591,6 +2596,7 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info) > /* We can't trust the free space cache either */ > btrfs_set_opt(fs_info->mount_opt, CLEAR_CACHE); > > + btrfs_warn(fs_info, "try to load backup roots slot %u", i); is int, so %d > ret = read_backup_root(fs_info, i); > backup_index = ret; > if (ret < 0) > -- > 2.41.0
On 2023/6/13 05:37, David Sterba wrote: > On Sun, Jun 11, 2023 at 08:09:13AM +0800, Qu Wenruo wrote: >> [BUG] >> Syzbot reports a reproducible ASSERT() when using rescue=usebackuproot >> mount option on a corrupted fs. >> >> The full report can be found here: >> https://syzkaller.appspot.com/bug?extid=c4614eae20a166c25bf0 > > I'll copy the stack trace here, the link is good for reference but we > want to record the contents too. > >> [CAUSE] >> Since the introduction of global roots, we handle >> csum/extent/free-space-tree roots as global roots, even if no >> extent-tree-v2 feature is enabled. >> >> So for regular csum/extent/fst roots, we load them into >> fs_info::global_root_tree rb tree. >> >> And we should not expect any conflicts in that rb tree, thus we have an >> ASSERT() inside btrfs_global_root_insert(). >> >> But rescue=usebackuproot can break the assumption, as we will try to >> load those trees again and again as long as we have bad roots and have >> backup roots slot remaining. >> >> So in that case we can have conflicting roots in the rb tree, and >> triggering the ASSERT() crash. >> >> [FIX] >> We can safely remove that ASSERT(), as the caller will properly put the >> offending root. >> >> To make further debugging easier, also add two explicit error messages: >> >> - Error message for conflicting global roots >> - Error message when using backup roots slot >> >> Reported-by: syzbot+a694851c6ab28cbcfb9c@syzkaller.appspotmail.com >> Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree") >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> fs/btrfs/disk-io.c | 10 ++++++++-- >> 1 file changed, 8 insertions(+), 2 deletions(-) >> >> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >> index 1168e5df8bae..7f201975a303 100644 >> --- a/fs/btrfs/disk-io.c >> +++ b/fs/btrfs/disk-io.c >> @@ -748,13 +748,18 @@ int btrfs_global_root_insert(struct btrfs_root *root) >> { >> struct btrfs_fs_info *fs_info = root->fs_info; >> struct rb_node *tmp; >> + int ret = 0; >> >> write_lock(&fs_info->global_root_lock); >> tmp = rb_find_add(&root->rb_node, &fs_info->global_root_tree, global_root_cmp); >> write_unlock(&fs_info->global_root_lock); >> - ASSERT(!tmp); >> >> - return tmp ? -EEXIST : 0; >> + if (tmp) { >> + ret = -EEXIST; >> + btrfs_warn(fs_info, "global root %lld %llu already exists", > > %lld is to show the negative numbers, right? Are there any actual global > roots with such numbers? > > BTRFS_DATA_RELOC_TREE_OBJECTID -9 > BTRFS_TREE_RELOC_OBJECTID -8 > BTRFS_TREE_LOG_FIXUP_OBJECTID -7 > BTRFS_TREE_LOG_OBJECTID -6 > > None of them seems to be among the global roots. Right, data reloc tree is not a global one, so %llu is OK. > >> + root->root_key.objectid, root->root_key.offset); >> + } >> + return ret; >> } >> >> void btrfs_global_root_delete(struct btrfs_root *root) >> @@ -2591,6 +2596,7 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info) >> /* We can't trust the free space cache either */ >> btrfs_set_opt(fs_info->mount_opt, CLEAR_CACHE); >> >> + btrfs_warn(fs_info, "try to load backup roots slot %u", i); > > is int, so %d But @i is only between [0, 4), thus %u is fine. Thanks, Qu > >> ret = read_backup_root(fs_info, i); >> backup_index = ret; >> if (ret < 0) >> -- >> 2.41.0
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 1168e5df8bae..7f201975a303 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -748,13 +748,18 @@ int btrfs_global_root_insert(struct btrfs_root *root) { struct btrfs_fs_info *fs_info = root->fs_info; struct rb_node *tmp; + int ret = 0; write_lock(&fs_info->global_root_lock); tmp = rb_find_add(&root->rb_node, &fs_info->global_root_tree, global_root_cmp); write_unlock(&fs_info->global_root_lock); - ASSERT(!tmp); - return tmp ? -EEXIST : 0; + if (tmp) { + ret = -EEXIST; + btrfs_warn(fs_info, "global root %lld %llu already exists", + root->root_key.objectid, root->root_key.offset); + } + return ret; } void btrfs_global_root_delete(struct btrfs_root *root) @@ -2591,6 +2596,7 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info) /* We can't trust the free space cache either */ btrfs_set_opt(fs_info->mount_opt, CLEAR_CACHE); + btrfs_warn(fs_info, "try to load backup roots slot %u", i); ret = read_backup_root(fs_info, i); backup_index = ret; if (ret < 0)
[BUG] Syzbot reports a reproducible ASSERT() when using rescue=usebackuproot mount option on a corrupted fs. The full report can be found here: https://syzkaller.appspot.com/bug?extid=c4614eae20a166c25bf0 [CAUSE] Since the introduction of global roots, we handle csum/extent/free-space-tree roots as global roots, even if no extent-tree-v2 feature is enabled. So for regular csum/extent/fst roots, we load them into fs_info::global_root_tree rb tree. And we should not expect any conflicts in that rb tree, thus we have an ASSERT() inside btrfs_global_root_insert(). But rescue=usebackuproot can break the assumption, as we will try to load those trees again and again as long as we have bad roots and have backup roots slot remaining. So in that case we can have conflicting roots in the rb tree, and triggering the ASSERT() crash. [FIX] We can safely remove that ASSERT(), as the caller will properly put the offending root. To make further debugging easier, also add two explicit error messages: - Error message for conflicting global roots - Error message when using backup roots slot Reported-by: syzbot+a694851c6ab28cbcfb9c@syzkaller.appspotmail.com Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rb tree") Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/disk-io.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)