Message ID | 20240220153059.11233-3-xni@redhat.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Mike Snitzer |
Headers | show |
Series | Fix regression bugs | expand |
Hi, 在 2024/02/20 23:30, Xiao Ni 写道: > After patch commit f52f5c71f3d4b ("md: fix stopping sync thread"), dmraid > stops sync thread asynchronously. The calling process is: > dev_remove->dm_destroy->__dm_destroy->raid_postsuspend->raid_dtr > > raid_postsuspend does two jobs. First, it stops sync thread. Then it > suspend array. Now it can stop sync thread successfully. But it doesn't > set MD_RECOVERY_FROZEN. It's introduced by patch f52f5c71f3d4b. So after > raid_postsuspend, the sync thread starts again. raid_dtr can't stop the > sync thread because the array is already suspended. > > This can be reproduced easily by those commands: > while [ 1 ]; do > vgcreate test_vg /dev/loop0 /dev/loop1 > lvcreate --type raid1 -L 400M -m 1 -n test_lv test_vg > lvchange -an test_vg > vgremove test_vg -ff > done > > Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") > Signed-off-by: Xiao Ni <xni@redhat.com> I agree with this change, but this patch is part of my patch in the other thread: dm-raid: really frozen sync_thread during suspend I still think that fix found problems completely is better, however, we'll let Song to make decision. Thanks, Kuai > --- > drivers/md/md.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index 54790261254d..77e3af7cf6bb 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -6337,6 +6337,7 @@ static void __md_stop_writes(struct mddev *mddev) > void md_stop_writes(struct mddev *mddev) > { > mddev_lock_nointr(mddev); > + set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); > __md_stop_writes(mddev); > mddev_unlock(mddev); > } >
On Thu, Feb 22, 2024 at 7:12 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: > > Hi, > > 在 2024/02/20 23:30, Xiao Ni 写道: > > After patch commit f52f5c71f3d4b ("md: fix stopping sync thread"), dmraid > > stops sync thread asynchronously. The calling process is: > > dev_remove->dm_destroy->__dm_destroy->raid_postsuspend->raid_dtr > > > > raid_postsuspend does two jobs. First, it stops sync thread. Then it > > suspend array. Now it can stop sync thread successfully. But it doesn't > > set MD_RECOVERY_FROZEN. It's introduced by patch f52f5c71f3d4b. So after > > raid_postsuspend, the sync thread starts again. raid_dtr can't stop the > > sync thread because the array is already suspended. > > > > This can be reproduced easily by those commands: > > while [ 1 ]; do > > vgcreate test_vg /dev/loop0 /dev/loop1 > > lvcreate --type raid1 -L 400M -m 1 -n test_lv test_vg > > lvchange -an test_vg > > vgremove test_vg -ff > > done > > > > Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") > > Signed-off-by: Xiao Ni <xni@redhat.com> > > I agree with this change, but this patch is part of my patch in the > other thread: > > dm-raid: really frozen sync_thread during suspend > > I still think that fix found problems completely is better, however, > we'll let Song to make decision. I think we still need more time (and maybe more iterations) for the other thread, so we can ship this change sooner. We should add SoB Kuai here. Thanks, Song
diff --git a/drivers/md/md.c b/drivers/md/md.c index 54790261254d..77e3af7cf6bb 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6337,6 +6337,7 @@ static void __md_stop_writes(struct mddev *mddev) void md_stop_writes(struct mddev *mddev) { mddev_lock_nointr(mddev); + set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); __md_stop_writes(mddev); mddev_unlock(mddev); }
After patch commit f52f5c71f3d4b ("md: fix stopping sync thread"), dmraid stops sync thread asynchronously. The calling process is: dev_remove->dm_destroy->__dm_destroy->raid_postsuspend->raid_dtr raid_postsuspend does two jobs. First, it stops sync thread. Then it suspend array. Now it can stop sync thread successfully. But it doesn't set MD_RECOVERY_FROZEN. It's introduced by patch f52f5c71f3d4b. So after raid_postsuspend, the sync thread starts again. raid_dtr can't stop the sync thread because the array is already suspended. This can be reproduced easily by those commands: while [ 1 ]; do vgcreate test_vg /dev/loop0 /dev/loop1 lvcreate --type raid1 -L 400M -m 1 -n test_lv test_vg lvchange -an test_vg vgremove test_vg -ff done Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Xiao Ni <xni@redhat.com> --- drivers/md/md.c | 1 + 1 file changed, 1 insertion(+)