[V2] btrfs: btrfs can not be mounted with corrupted extent root

Message ID 20250102063026.12223-1-lizhi.xu@windriver.com (mailing list archive)
State New

Commit Message

Lizhi Xu Jan. 2, 2025, 6:30 a.m. UTC
syzbot reported a null-ptr-deref in find_first_extent_item. [1]

When mounted with the ignorebadroots option, btrfs failed to insert the
extent root into the global root tree. The extent buffer is not uptodate,
so reading the tree root fails, and as a result the extent root is never
inserted into the global root tree.

To prevent this, fail the mount if the extent root is corrupt.

[1]
Unable to handle kernel paging request at virtual address dfff800000000041
KASAN: null-ptr-deref in range [0x0000000000000208-0x000000000000020f]
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[dfff800000000041] address between user and kernel address ranges
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 UID: 0 PID: 6417 Comm: syz-executor153 Not tainted 6.13.0-rc3-syzkaller-g573067a5a685 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : find_first_extent_item+0xac/0x674 fs/btrfs/scrub.c:1375
lr : find_first_extent_item+0xa4/0x674 fs/btrfs/scrub.c:1374
sp : ffff8000a5be6e60
x29: ffff8000a5be6f80 x28: dfff800000000000 x27: 0000000000000000
x26: 0000000000400000 x25: 0000000000400000 x24: 1fffe0001848ab0a
x23: 0000000000000208 x22: ffff8000a5be6f20 x21: ffff0000c2455858
x20: ffff8000a5be6ec0 x19: ffff0000db072010 x18: ffff0000db072010
x17: 000000000000e32c x16: ffff80008b5fea08 x15: 0000000000000004
x14: 1fffe0001b60c031 x13: 0000000000000000 x12: ffff700014b7cdd8
x11: ffff80008257f234 x10: 0000000000ff0100 x9 : 0000000000000000
x8 : 0000000000000041 x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : 0000000000000008 x3 : 0000000000400000
x2 : 0000000000100000 x1 : ffff0000db072010 x0 : 0000000000000000
Call trace:
 find_first_extent_item+0xac/0x674 fs/btrfs/scrub.c:1375 (P)
 scrub_find_fill_first_stripe+0x2c0/0xab8 fs/btrfs/scrub.c:1551
 queue_scrub_stripe fs/btrfs/scrub.c:1921 [inline]
 scrub_simple_mirror+0x440/0x7e4 fs/btrfs/scrub.c:2152
 scrub_stripe+0x7e4/0x2174 fs/btrfs/scrub.c:2317
 scrub_chunk+0x268/0x41c fs/btrfs/scrub.c:2443
 scrub_enumerate_chunks+0xd38/0x1784 fs/btrfs/scrub.c:2707
 btrfs_scrub_dev+0x5a8/0xb34 fs/btrfs/scrub.c:3029
 btrfs_ioctl_scrub+0x1f4/0x3e8 fs/btrfs/ioctl.c:3248
 btrfs_ioctl+0x6a8/0xb04 fs/btrfs/ioctl.c:5246
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:906 [inline]
 __se_sys_ioctl fs/ioctl.c:892 [inline]
 __arm64_sys_ioctl+0x14c/0x1cc fs/ioctl.c:892
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]

Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rbtree")
Reported-by: syzbot+339e9dbe3a2ca419b85d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=339e9dbe3a2ca419b85d
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
V1 -> V2: exit mount when extent root is corrupt

 fs/btrfs/disk-io.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
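
The faulting address and the KASAN range in the trace above are consistent with each other. As a worked check, assuming generic KASAN's usual mapping (shadow = (addr >> 3) + offset) and taking the offset from register x28 in the dump, the sketch below recomputes the shadow address for 0x208. The function and macro names are illustrative, not kernel identifiers.

```c
#include <stdint.h>

/* Assumption: generic KASAN maps 8 bytes of memory to 1 shadow byte. */
#define SKETCH_SHADOW_SCALE_SHIFT 3
/* Assumption: shadow offset as it appears in x28 of the register dump. */
#define SKETCH_SHADOW_OFFSET 0xdfff800000000000ULL

/* Return the shadow-memory address that tracks the given address. */
static inline uint64_t sketch_mem_to_shadow(uint64_t addr)
{
	return (addr >> SKETCH_SHADOW_SCALE_SHIFT) + SKETCH_SHADOW_OFFSET;
}
```

Calling sketch_mem_to_shadow(0x208) gives 0xdfff800000000041, the address in the "Unable to handle kernel paging request" line. The value 0x208 (also in x23) is a small offset from a NULL base pointer, which is why KASAN reports the access as a null-ptr-deref.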

Comments

Qu Wenruo Jan. 2, 2025, 8:21 a.m. UTC | #1
On 2025/1/2 17:00, Lizhi Xu wrote:
> syzbot reported a null-ptr-deref in find_first_extent_item. [1]
>
> The btrfs filesystem did not successfully initialize extent root to the
> global root tree when mounted(as the mount options contain ignorebadroots),
> this is because extent buffer is not uptodate,

The "not uptodate" state is only the symptom. If you check the console
output carefully, the real cause is that the extent tree root is corrupted
(csum mismatch). It must be the extent tree root itself; a corrupted child
node would not cause this problem.


> which causes the failure to
> read the tree root, which in turn causes extent root to not be inserted into
> the global root tree.
>
> To prevent this issue, if extent root is corrupt then exit the mount.
>
> Fixes: abed4aaae4f7 ("btrfs: track the csum, extent, and free space trees in a rbtree")

I do not think that's the correct commit.

Even before that commit, scrub_stripe() used fs_info->extent_root directly
without checking whether it was NULL.

It then passed that extent_root into btrfs_reada_add(), which later executes
"rc->fs_info = root->fs_info;", triggering exactly the same error.

Before 42437a6386ff ("btrfs: introduce mount option
rescue=ignorebadroots"), there was no error path like this at all,
because btrfs would simply refuse to mount a filesystem whose extent tree
root was corrupted.
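
The NULL dereference described above can be modeled outside the kernel. The sketch below is a minimal stand-alone model, not btrfs code: sketch_fs_info, sketch_root and sketch_find_first_extent() are hypothetical names, and returning -EUCLEAN is only an assumption about how a caller might report a missing extent root instead of dereferencing it.

```c
#include <errno.h>
#include <stddef.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value; fallback for hosts that lack it */
#endif

struct sketch_fs_info;

struct sketch_root {
	struct sketch_fs_info *fs_info;
};

struct sketch_fs_info {
	/* NULL when the extent root could not be loaded at mount time. */
	struct sketch_root *extent_root;
};

/*
 * Model of a scrub helper that needs the extent root. Checking for NULL
 * before the first dereference turns the oops into an error return.
 */
static inline int sketch_find_first_extent(const struct sketch_fs_info *fs_info)
{
	const struct sketch_root *root = fs_info->extent_root;

	if (!root)
		return -EUCLEAN;

	/* From here on it is safe to dereference root. */
	return root->fs_info == fs_info ? 0 : -EINVAL;
}
```

With extent_root left NULL, as happens when rescue=ignorebadroots skips a corrupted extent tree root, the model returns -EUCLEAN rather than reading through a NULL pointer.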

So please understand what is really causing the problem, then submit a fix.

Thanks,
Qu
> Reported-by: syzbot+339e9dbe3a2ca419b85d@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=339e9dbe3a2ca419b85d
> Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
> ---
> V1 -> V2: exit mount when extent root is corrupt
>
>   fs/btrfs/disk-io.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index eff0dd1ae62f..beb236c7fe1c 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2167,7 +2167,8 @@ static int load_global_roots_objectid(struct btrfs_root *tree_root,
>   		found = true;
>   		root = read_tree_root_path(tree_root, path, &key);
>   		if (IS_ERR(root)) {
> -			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS))
> +			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS) ||
> +			   objectid == BTRFS_EXTENT_TREE_OBJECTID)
>   				ret = PTR_ERR(root);
>   			break;
>   		}
> @@ -2188,7 +2189,8 @@ static int load_global_roots_objectid(struct btrfs_root *tree_root,
>   		if (objectid == BTRFS_CSUM_TREE_OBJECTID)
>   			set_bit(BTRFS_FS_STATE_NO_DATA_CSUMS, &fs_info->fs_state);
>
> -		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS))
> +		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS) ||
> +		   (ret && objectid == BTRFS_EXTENT_TREE_OBJECTID))
>   			ret = ret ? ret : -ENOENT;
>   		else
>   			ret = 0;
Lizhi Xu Jan. 2, 2025, 9:45 a.m. UTC | #2
On Thu, 2 Jan 2025 18:51:34 +1030, Qu Wenruo wrote:
> > syzbot reported a null-ptr-deref in find_first_extent_item. [1]
> >
> > The btrfs filesystem did not successfully initialize extent root to the
> > global root tree when mounted(as the mount options contain ignorebadroots),
> > this is because extent buffer is not uptodate,
> 
> The "not uptodate" is only the symptom, if you check the console output
> carefully enough, it's because the extent tree root (and must be extent
> tree root, any child node won't cause problem) is corrupted (csum mismatch).
> 
[   35.752834][ T3330] BTRFS warning (device loop0): checksum verify failed on logical 5337088 mirror 1 wanted 0x324c5e2d0cac2dc8f61cbfdfc8cd69d9816061b1498b9e1bff0
The log above makes it clear that the failure of btrfs_validate_extent_buffer()
leaves the extent buffer not uptodate, from which it follows that the extent root is corrupted.

BR,
Lizhi
Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index eff0dd1ae62f..beb236c7fe1c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2167,7 +2167,8 @@  static int load_global_roots_objectid(struct btrfs_root *tree_root,
 		found = true;
 		root = read_tree_root_path(tree_root, path, &key);
 		if (IS_ERR(root)) {
-			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS))
+			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS) ||
+			   objectid == BTRFS_EXTENT_TREE_OBJECTID)
 				ret = PTR_ERR(root);
 			break;
 		}
@@ -2188,7 +2189,8 @@  static int load_global_roots_objectid(struct btrfs_root *tree_root,
 		if (objectid == BTRFS_CSUM_TREE_OBJECTID)
 			set_bit(BTRFS_FS_STATE_NO_DATA_CSUMS, &fs_info->fs_state);
 
-		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS))
+		if (!btrfs_test_opt(fs_info, IGNOREBADROOTS) ||
+		   (ret && objectid == BTRFS_EXTENT_TREE_OBJECTID))
 			ret = ret ? ret : -ENOENT;
 		else
 			ret = 0;
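
To make the effect of the first hunk easier to check, the sketch below models only its condition as a pure function. It is a stand-alone model under stated assumptions: sketch_read_error_ret() and its parameters are hypothetical, ret is assumed to be 0 on entry to the branch, and the objectid values 2 (extent tree) and 7 (csum tree) are assumed from the btrfs on-disk format rather than stated in this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values of the btrfs objectids; not stated in this thread. */
#define SKETCH_EXTENT_TREE_OBJECTID 2ULL
#define SKETCH_CSUM_TREE_OBJECTID   7ULL

/*
 * Model of the patched IS_ERR(root) branch. read_err is the negative errno
 * produced by reading the root. The return value is what the branch leaves
 * in ret before breaking out of the loop.
 */
static inline int sketch_read_error_ret(bool ignorebadroots, uint64_t objectid,
					int read_err)
{
	if (!ignorebadroots || objectid == SKETCH_EXTENT_TREE_OBJECTID)
		return read_err;	/* error is kept, so the mount fails */
	return 0;			/* error is ignored for other roots */
}
```

Under this model, a read error on the extent tree root is propagated even with ignorebadroots set, while the same error on the csum tree root is still ignored when the option is present.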