Message ID | 20231221071109.1562530-2-linan666@huaweicloud.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | md: bugfix of creating symlink with disk holder | expand |
Hi, On Wed, Dec 20, 2023 at 11:13 PM <linan666@huaweicloud.com> wrote: > > From: Li Nan <linan122@huawei.com> > > Removing a device can trigger WARN_ON in bd_unlink_disk_holder() if creating > symlink failed while adding device. > > WARNING: CPU: 0 PID: 742 at block/holder.c:145 bd_unlink_disk_holder+0x17b/0x1a0 > > Fix it by adding the flag 'SymlinkCreated', which only be set after > creating symlink success. > > Signed-off-by: Li Nan <linan122@huawei.com> > --- > drivers/md/md.h | 3 +++ > drivers/md/md.c | 8 ++++++-- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/md.h b/drivers/md/md.h > index 8d881cc59799..427d17713a8c 100644 > --- a/drivers/md/md.h > +++ b/drivers/md/md.h > @@ -207,6 +207,9 @@ enum flag_bits { > * check if there is collision between raid1 > * serial bios. > */ > + SymlinkCreated, /* This device has created the symlink > + * with gendisk. > + */ In general, I would like to avoid adding flags if possible. > }; > > static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors, > diff --git a/drivers/md/md.c b/drivers/md/md.c > index e05858653a41..d6612b922c76 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -2526,7 +2526,8 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev) > sysfs_get_dirent_safe(rdev->kobj.sd, "bad_blocks"); > > list_add_rcu(&rdev->same_set, &mddev->disks); > - bd_link_disk_holder(rdev->bdev, mddev->gendisk); > + if (!bd_link_disk_holder(rdev->bdev, mddev->gendisk)) > + set_bit(SymlinkCreated, &rdev->flags); Shall we just fail bind_rdev_to_array() if bd_link_disk_holder() returns non-zero? Thanks, Song
在 2023/12/22 2:58, Song Liu 写道: > Hi, > > On Wed, Dec 20, 2023 at 11:13 PM <linan666@huaweicloud.com> wrote: >> >> From: Li Nan <linan122@huawei.com> >> >> Removing a device can trigger WARN_ON in bd_unlink_disk_holder() if creating >> symlink failed while adding device. >> >> WARNING: CPU: 0 PID: 742 at block/holder.c:145 bd_unlink_disk_holder+0x17b/0x1a0 >> >> Fix it by adding the flag 'SymlinkCreated', which only be set after >> creating symlink success. >> >> Signed-off-by: Li Nan <linan122@huawei.com> >> --- >> drivers/md/md.h | 3 +++ >> drivers/md/md.c | 8 ++++++-- >> 2 files changed, 9 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/md/md.h b/drivers/md/md.h >> index 8d881cc59799..427d17713a8c 100644 >> --- a/drivers/md/md.h >> +++ b/drivers/md/md.h >> @@ -207,6 +207,9 @@ enum flag_bits { >> * check if there is collision between raid1 >> * serial bios. >> */ >> + SymlinkCreated, /* This device has created the symlink >> + * with gendisk. >> + */ > > In general, I would like to avoid adding flags if possible. > This flag is mainly used to fix deadlock in next patch. Or should we export bd_find_holder_disk()? Link hodler if it return NULL. just like: rdev_for_each_rcu if (!bd_find_holder_disk) bd_link_disk_holder >> }; >> >> static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors, >> diff --git a/drivers/md/md.c b/drivers/md/md.c >> index e05858653a41..d6612b922c76 100644 >> --- a/drivers/md/md.c >> +++ b/drivers/md/md.c >> @@ -2526,7 +2526,8 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev) >> sysfs_get_dirent_safe(rdev->kobj.sd, "bad_blocks"); >> >> list_add_rcu(&rdev->same_set, &mddev->disks); >> - bd_link_disk_holder(rdev->bdev, mddev->gendisk); >> + if (!bd_link_disk_holder(rdev->bdev, mddev->gendisk)) >> + set_bit(SymlinkCreated, &rdev->flags); > > Shall we just fail bind_rdev_to_array() if bd_link_disk_holder() > returns non-zero? > I keep this action because of commit 00bcb4ac7ee7 ("md: reduce dependence on sysfs."). Fail bind_rdev_to_array is good to me. > Thanks, > Song > > .
On Thu, Dec 21, 2023 at 5:17 PM Li Nan <linan666@huaweicloud.com> wrote: > > > > 在 2023/12/22 2:58, Song Liu 写道: [...] > > In general, I would like to avoid adding flags if possible. > > > > This flag is mainly used to fix deadlock in next patch. Or should we > export bd_find_holder_disk()? Link hodler if it return NULL. > just like: > > rdev_for_each_rcu > if (!bd_find_holder_disk) > bd_link_disk_holder I was thinking we will not need the flag if we fail bind_rdev_to_array() on error (below). > > > >> }; > >> > >> static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors, > >> diff --git a/drivers/md/md.c b/drivers/md/md.c > >> index e05858653a41..d6612b922c76 100644 > >> --- a/drivers/md/md.c > >> +++ b/drivers/md/md.c > >> @@ -2526,7 +2526,8 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev) > >> sysfs_get_dirent_safe(rdev->kobj.sd, "bad_blocks"); > >> > >> list_add_rcu(&rdev->same_set, &mddev->disks); > >> - bd_link_disk_holder(rdev->bdev, mddev->gendisk); > >> + if (!bd_link_disk_holder(rdev->bdev, mddev->gendisk)) > >> + set_bit(SymlinkCreated, &rdev->flags); > > > > Shall we just fail bind_rdev_to_array() if bd_link_disk_holder() > > returns non-zero? > > > > I keep this action because of commit 00bcb4ac7ee7 ("md: reduce > dependence on sysfs."). Fail bind_rdev_to_array is good to me. I wonder whether the assumption in 00bcb4ac7ee7 is still true. If bd_link_disk_holder() fails for valid reasons, we need to handle it properly (set a flag, check the flag on unlink, etc.). If we only fail bd_link_disk_holder() on extreme cases (ENOMEM, etc.), we can just fail bind_rdev_to_array(). Thanks, Song
Hi, 在 2023/12/25 9:11, Song Liu 写道: > On Thu, Dec 21, 2023 at 5:17 PM Li Nan <linan666@huaweicloud.com> wrote: >> >> >> >> 在 2023/12/22 2:58, Song Liu 写道: > [...] >>> In general, I would like to avoid adding flags if possible. >>> >> >> This flag is mainly used to fix deadlock in next patch. Or should we >> export bd_find_holder_disk()? Link hodler if it return NULL. >> just like: >> >> rdev_for_each_rcu >> if (!bd_find_holder_disk) >> bd_link_disk_holder > > I was thinking we will not need the flag if we fail bind_rdev_to_array() > on error (below). > >> >> >>>> }; >>>> >>>> static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors, >>>> diff --git a/drivers/md/md.c b/drivers/md/md.c >>>> index e05858653a41..d6612b922c76 100644 >>>> --- a/drivers/md/md.c >>>> +++ b/drivers/md/md.c >>>> @@ -2526,7 +2526,8 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev) >>>> sysfs_get_dirent_safe(rdev->kobj.sd, "bad_blocks"); >>>> >>>> list_add_rcu(&rdev->same_set, &mddev->disks); >>>> - bd_link_disk_holder(rdev->bdev, mddev->gendisk); >>>> + if (!bd_link_disk_holder(rdev->bdev, mddev->gendisk)) >>>> + set_bit(SymlinkCreated, &rdev->flags); >>> >>> Shall we just fail bind_rdev_to_array() if bd_link_disk_holder() >>> returns non-zero? >>> >> >> I keep this action because of commit 00bcb4ac7ee7 ("md: reduce >> dependence on sysfs."). Fail bind_rdev_to_array is good to me. > > I wonder whether the assumption in 00bcb4ac7ee7 is still true. If > bd_link_disk_holder() fails for valid reasons, we need to handle it > properly (set a flag, check the flag on unlink, etc.). If we only fail > bd_link_disk_holder() on extreme cases (ENOMEM, etc.), we can > just fail bind_rdev_to_array(). I totally agree! Currently, bd_link_disk_holder() from md won't return -EINVAL, it will return -ENOMEM or -ENODEV if underlying disk is deleted, which means bind_rdev_to_array() should fail as well. The only problem is that this will make next patch more complicated, but I think this can be solved. Thanks, Kuai > > Thanks, > Song > . >
diff --git a/drivers/md/md.h b/drivers/md/md.h index 8d881cc59799..427d17713a8c 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -207,6 +207,9 @@ enum flag_bits { * check if there is collision between raid1 * serial bios. */ + SymlinkCreated, /* This device has created the symlink + * with gendisk. + */ }; static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors, diff --git a/drivers/md/md.c b/drivers/md/md.c index e05858653a41..d6612b922c76 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -2526,7 +2526,8 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev) sysfs_get_dirent_safe(rdev->kobj.sd, "bad_blocks"); list_add_rcu(&rdev->same_set, &mddev->disks); - bd_link_disk_holder(rdev->bdev, mddev->gendisk); + if (!bd_link_disk_holder(rdev->bdev, mddev->gendisk)) + set_bit(SymlinkCreated, &rdev->flags); /* May as well allow recovery to be retried once */ mddev->recovery_disabled++; @@ -2561,7 +2562,10 @@ static void md_kick_rdev_from_array(struct md_rdev *rdev) { struct mddev *mddev = rdev->mddev; - bd_unlink_disk_holder(rdev->bdev, rdev->mddev->gendisk); + if (test_bit(SymlinkCreated, &rdev->flags)) { + bd_unlink_disk_holder(rdev->bdev, rdev->mddev->gendisk); + clear_bit(SymlinkCreated, &rdev->flags); + } list_del_rcu(&rdev->same_set); pr_debug("md: unbind<%pg>\n", rdev->bdev); mddev_destroy_serial_pool(rdev->mddev, rdev);