Message ID | 9d97cc69e823574da0c7586cdfbd49d210b8a246.1606938211.git.josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Cleanup error handling in relocation | expand |
On 2020/12/3 上午3:51, Josef Bacik wrote: > While doing error injection testing with my relocation patches I hit the > following ASSERT() > > assertion failed: list_empty(&block_group->dirty_list), in fs/btrfs/block-group.c:3356 > ------------[ cut here ]------------ > kernel BUG at fs/btrfs/ctree.h:3357! > invalid opcode: 0000 [#1] SMP NOPTI > CPU: 0 PID: 24351 Comm: umount Tainted: G W 5.10.0-rc3+ #193 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 > RIP: 0010:assertfail.constprop.0+0x18/0x1a > RSP: 0018:ffffa09b019c7e00 EFLAGS: 00010282 > RAX: 0000000000000056 RBX: ffff8f6492c18000 RCX: 0000000000000000 > RDX: ffff8f64fbc27c60 RSI: ffff8f64fbc19050 RDI: ffff8f64fbc19050 > RBP: ffff8f6483bbdc00 R08: 0000000000000000 R09: 0000000000000000 > R10: ffffa09b019c7c38 R11: ffffffff85d70928 R12: ffff8f6492c18100 > R13: ffff8f6492c18148 R14: ffff8f6483bbdd70 R15: dead000000000100 > FS: 00007fbfda4cdc40(0000) GS:ffff8f64fbc00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fbfda666fd0 CR3: 000000013cf66002 CR4: 0000000000370ef0 > Call Trace: > btrfs_free_block_groups.cold+0x55/0x55 > close_ctree+0x2c5/0x306 > ? fsnotify_destroy_marks+0x14/0x100 > generic_shutdown_super+0x6c/0x100 > kill_anon_super+0x14/0x30 > btrfs_kill_super+0x12/0x20 > deactivate_locked_super+0x36/0xa0 > cleanup_mnt+0x12d/0x190 > task_work_run+0x5c/0xa0 > exit_to_user_mode_prepare+0x1b1/0x1d0 > syscall_exit_to_user_mode+0x54/0x280 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > This happened because I injected an error in btrfs_cow_block() while > running the dirty block groups. When we run the dirty block groups, we > splice the list onto a local list to process. However if an error > occurs, we only cleanup the transactions dirty block group list, not any > pending block groups we have on our locally spliced list. Fix this by > splicing the list back onto the transactions dirty block group list, so > any remaining block groups are cleaned up. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks, Qu > --- > fs/btrfs/block-group.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c > index 0886e81e5540..5cfa52b1a3b8 100644 > --- a/fs/btrfs/block-group.c > +++ b/fs/btrfs/block-group.c > @@ -2685,6 +2685,9 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans) > } > spin_unlock(&cur_trans->dirty_bgs_lock); > } else if (ret < 0) { > + spin_lock(&cur_trans->dirty_bgs_lock); > + list_splice_init(&dirty, &cur_trans->dirty_bgs); > + spin_unlock(&cur_trans->dirty_bgs_lock); > btrfs_cleanup_dirty_bgs(cur_trans, fs_info); > } > >
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 0886e81e5540..5cfa52b1a3b8 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -2685,6 +2685,9 @@ int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans) } spin_unlock(&cur_trans->dirty_bgs_lock); } else if (ret < 0) { + spin_lock(&cur_trans->dirty_bgs_lock); + list_splice_init(&dirty, &cur_trans->dirty_bgs); + spin_unlock(&cur_trans->dirty_bgs_lock); btrfs_cleanup_dirty_bgs(cur_trans, fs_info); }
While doing error injection testing with my relocation patches I hit the following ASSERT() assertion failed: list_empty(&block_group->dirty_list), in fs/btrfs/block-group.c:3356 ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.h:3357! invalid opcode: 0000 [#1] SMP NOPTI CPU: 0 PID: 24351 Comm: umount Tainted: G W 5.10.0-rc3+ #193 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 RIP: 0010:assertfail.constprop.0+0x18/0x1a RSP: 0018:ffffa09b019c7e00 EFLAGS: 00010282 RAX: 0000000000000056 RBX: ffff8f6492c18000 RCX: 0000000000000000 RDX: ffff8f64fbc27c60 RSI: ffff8f64fbc19050 RDI: ffff8f64fbc19050 RBP: ffff8f6483bbdc00 R08: 0000000000000000 R09: 0000000000000000 R10: ffffa09b019c7c38 R11: ffffffff85d70928 R12: ffff8f6492c18100 R13: ffff8f6492c18148 R14: ffff8f6483bbdd70 R15: dead000000000100 FS: 00007fbfda4cdc40(0000) GS:ffff8f64fbc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbfda666fd0 CR3: 000000013cf66002 CR4: 0000000000370ef0 Call Trace: btrfs_free_block_groups.cold+0x55/0x55 close_ctree+0x2c5/0x306 ? fsnotify_destroy_marks+0x14/0x100 generic_shutdown_super+0x6c/0x100 kill_anon_super+0x14/0x30 btrfs_kill_super+0x12/0x20 deactivate_locked_super+0x36/0xa0 cleanup_mnt+0x12d/0x190 task_work_run+0x5c/0xa0 exit_to_user_mode_prepare+0x1b1/0x1d0 syscall_exit_to_user_mode+0x54/0x280 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This happened because I injected an error in btrfs_cow_block() while running the dirty block groups. When we run the dirty block groups, we splice the list onto a local list to process. However if an error occurs, we only cleanup the transactions dirty block group list, not any pending block groups we have on our locally spliced list. Fix this by splicing the list back onto the transactions dirty block group list, so any remaining block groups are cleaned up. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- fs/btrfs/block-group.c | 3 +++ 1 file changed, 3 insertions(+)