Message ID: 20240220153059.11233-1-xni@redhat.com (mailing list archive)
Series: Fix regression bugs
On Tue, Feb 20, 2024 at 11:30:55PM +0800, Xiao Ni wrote:
> Hi all
>
> Sorry, I know this patch set conflicts with Yu Kuai's patch set, but I
> have to send it out. We're now facing some deadlock regression
> problems, so it's better to figure out the root cause and fix them.
> Kuai's patch set looks too complicated to me, and, as we discussed in
> the emails, it breaks some rules. It's not good to fix a problem by
> breaking the original logic; if we really need to break some logic,
> it's better to do that in a distinct patch set that describes why we
> need it.
>
> This patch set is based on Linus's tree, tag 6.8-rc5. If this patch
> set is accepted, we need to revert Kuai's patches that have been
> merged into Song's tree (md-6.8-20240216 tag). This patch set has four
> patches. The first two resolve deadlock problems; with these two, most
> of the deadlocks are resolved. The third one fixes an active_io
> counter bug. The fourth one fixes the raid5 reshape deadlock problem.

With this patchset on top of the v6.8-rc5 kernel I can still see a hang
tearing down the devices at the end of lvconvert-raid-reshape.sh if I
run it repeatedly. I haven't dug into this enough to be certain, but it
appears that when this hangs, make_stripe_request() keeps returning
STRIPE_SCHEDULE_AND_RETRY because of

    ahead_of_reshape(mddev, logical_sector, conf->reshape_safe)

so it never runs stripe_across_reshape() from your last patch.
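For reference, that check sits in the reshape branch of
make_stripe_request(); a simplified sketch (paraphrased from
drivers/md/raid5.c around v6.8-rc5, not the exact upstream code):

static enum stripe_result make_stripe_request(struct mddev *mddev,
		struct r5conf *conf, struct stripe_request_ctx *ctx,
		sector_t logical_sector, struct bio *bi)
{
	int previous = 0;

	if (unlikely(conf->reshape_progress != MaxSector)) {
		spin_lock_irq(&conf->device_lock);
		if (ahead_of_reshape(mddev, logical_sector,
				     conf->reshape_progress)) {
			/* Not reshaped yet: use the old geometry. */
			previous = 1;
		} else if (ahead_of_reshape(mddev, logical_sector,
					    conf->reshape_safe)) {
			/*
			 * Already reshaped past this sector, but the new
			 * layout isn't recorded as safe yet: back off and
			 * retry.  If reshape_safe never advances, the bio
			 * is retried forever.
			 */
			spin_unlock_irq(&conf->device_lock);
			return STRIPE_SCHEDULE_AND_RETRY;
		}
		spin_unlock_irq(&conf->device_lock);
	}
	/* ... normal stripe handling continues ... */
}

/* Compares a sector against reshape progress, honouring direction. */
static bool ahead_of_reshape(struct mddev *mddev, sector_t sector,
			     sector_t reshape_sector)
{
	return mddev->reshape_backwards ? sector < reshape_sector
					: sector >= reshape_sector;
}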
It hangs with the following hung-task backtrace:

[ 4569.331345] sysrq: Show Blocked State
[ 4569.332640] task:mdX_resync state:D stack:0 pid:155469 tgid:155469 ppid:2 flags:0x00004000
[ 4569.335367] Call Trace:
[ 4569.336122] <TASK>
[ 4569.336758] __schedule+0x3ec/0x15c0
[ 4569.337789] ? __schedule+0x3f4/0x15c0
[ 4569.338433] ? __wake_up_klogd.part.0+0x3c/0x60
[ 4569.339186] schedule+0x32/0xd0
[ 4569.339709] md_do_sync+0xede/0x11c0
[ 4569.340324] ? __pfx_autoremove_wake_function+0x10/0x10
[ 4569.341183] ? __pfx_md_thread+0x10/0x10
[ 4569.341831] md_thread+0xab/0x190
[ 4569.342397] kthread+0xe5/0x120
[ 4569.342933] ? __pfx_kthread+0x10/0x10
[ 4569.343554] ret_from_fork+0x31/0x50
[ 4569.344152] ? __pfx_kthread+0x10/0x10
[ 4569.344761] ret_from_fork_asm+0x1b/0x30
[ 4569.345193] </TASK>
[ 4569.345403] task:dmsetup state:D stack:0 pid:156091 tgid:156091 ppid:155933 flags:0x00004002
[ 4569.346300] Call Trace:
[ 4569.346538] <TASK>
[ 4569.346746] __schedule+0x3ec/0x15c0
[ 4569.347097] ? __schedule+0x3f4/0x15c0
[ 4569.347440] ? sysvec_call_function_single+0xe/0x90
[ 4569.347905] ? asm_sysvec_call_function_single+0x1a/0x20
[ 4569.348401] ? __pfx_dev_remove+0x10/0x10
[ 4569.348779] schedule+0x32/0xd0
[ 4569.349079] stop_sync_thread+0x136/0x1d0
[ 4569.349465] ? __pfx_autoremove_wake_function+0x10/0x10
[ 4569.349965] __md_stop_writes+0x15/0xe0
[ 4569.350341] md_stop_writes+0x29/0x40
[ 4569.350698] raid_postsuspend+0x53/0x60 [dm_raid]
[ 4569.351159] dm_table_postsuspend_targets+0x3d/0x60
[ 4569.351627] __dm_destroy+0x1c5/0x1e0
[ 4569.351984] dev_remove+0x11d/0x190
[ 4569.352328] ctl_ioctl+0x30e/0x5e0
[ 4569.352659] dm_ctl_ioctl+0xe/0x20
[ 4569.352992] __x64_sys_ioctl+0x94/0xd0
[ 4569.353352] do_syscall_64+0x86/0x170
[ 4569.353703] ? dm_ctl_ioctl+0xe/0x20
[ 4569.354059] ? syscall_exit_to_user_mode+0x89/0x230
[ 4569.354517] ? do_syscall_64+0x96/0x170
[ 4569.354891] ? exc_page_fault+0x7f/0x180
[ 4569.355258] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 4569.355744] RIP: 0033:0x7f49e5dbc13d
[ 4569.356113] RSP: 002b:00007ffc365585f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 4569.356804] RAX: ffffffffffffffda RBX: 000055638c4932c0 RCX: 00007f49e5dbc13d
[ 4569.357488] RDX: 000055638c493af0 RSI: 00000000c138fd04 RDI: 0000000000000003
[ 4569.358140] RBP: 00007ffc36558640 R08: 00007f49e5fbc690 R09: 00007ffc365584a8
[ 4569.358783] R10: 00007f49e5fbb97d R11: 0000000000000246 R12: 00007f49e5fbb97d
[ 4569.359442] R13: 000055638c493ba0 R14: 00007f49e5fbb97d R15: 00007f49e5fbb97d
[ 4569.360090] </TASK>

>
> I have run the lvm2 regression test. There are 4 failed cases:
> shell/dmsetup-integrity-keys.sh
> shell/lvresize-fs-crypt.sh
> shell/pvck-dump.sh
> shell/select-report.sh
>
> Xiao Ni (4):
>   Clear MD_RECOVERY_WAIT when stopping dmraid
>   Set MD_RECOVERY_FROZEN before stop sync thread
>   md: Missing decrease active_io for flush io
>   Don't check crossing reshape when reshape hasn't started
>
>  drivers/md/dm-raid.c |  2 ++
>  drivers/md/md.c      |  8 +++++++-
>  drivers/md/raid5.c   | 22 ++++++++++------------
>  3 files changed, 19 insertions(+), 13 deletions(-)
>
> --
> 2.32.0 (Apple Git-132)
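Reading the two stacks together: dmsetup is tearing the device down and
waiting for the sync thread to exit, while mdX_resync is the sync
thread itself, parked in md_do_sync(). Roughly, the teardown side does
the following (a simplified sketch paraphrased from stop_sync_thread()
in drivers/md/md.c around v6.8; not the exact upstream code):

/* Reached via raid_postsuspend() -> md_stop_writes(): ask the sync
 * thread to stop, then wait for MD_RECOVERY_RUNNING to clear.  If
 * md_do_sync() never gets past its own wait, this blocks forever. */
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
md_wakeup_thread(mddev->sync_thread);
wait_event(resync_wait,
	   !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));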
On Wed, Feb 21, 2024 at 1:45 PM Benjamin Marzinski <bmarzins@redhat.com> wrote:
>
> On Tue, Feb 20, 2024 at 11:30:55PM +0800, Xiao Ni wrote:
> > [...]
>
> With this patchset on top of the v6.8-rc5 kernel I can still see a hang
> tearing down the devices at the end of lvconvert-raid-reshape.sh if I
> run it repeatedly. I haven't dug into this enough to be certain, but it
> appears that when this hangs, make_stripe_request() keeps returning
> STRIPE_SCHEDULE_AND_RETRY because of
>
>     ahead_of_reshape(mddev, logical_sector, conf->reshape_safe)
>
> so it never runs stripe_across_reshape() from your last patch.
>
> It hangs with the following hung-task backtrace:
>
> [...]
Hi Ben

I can reproduce this with 6.6 too, so it's not a regression introduced
by the change (stopping the sync thread asynchronously). I'm trying to
debug it and find the root cause. In 6.8 with my patch set, the logs
show it's stuck at:

    wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));

But raid5's conf->active_stripes is 0, so I'm still looking into why
this can happen.

Best Regards
Xiao
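For reference, a simplified sketch of the accounting behind that wait
(paraphrased from drivers/md/md.c; not the exact upstream code):
md_do_sync() adds the sectors it hands to the personality's
sync_request() to recovery_active, and md_done_sync() drops them again
when the sync I/O completes, so the wait above can only hang if some
sync request is never completed:

/* In md_do_sync(): account each chunk handed to the personality. */
sectors = mddev->pers->sync_request(mddev, j, &skipped);
atomic_add(sectors, &mddev->recovery_active);

/* In md_done_sync(): called when that sync I/O finishes. */
void md_done_sync(struct mddev *mddev, int blocks, int ok)
{
	atomic_sub(blocks, &mddev->recovery_active);
	wake_up(&mddev->recovery_wait);
	/* ... error handling when !ok ... */
}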