Message ID | 20230719070954.3084379-2-xueshi.hu@smartx.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | don't change mempool if in-flight r1bio exists | expand |
Hi, 在 2023/07/19 15:09, Xueshi Hu 写道: > When an IO error happens, reschedule_retry() will increase > r1conf::nr_queued, which makes freeze_array() unblocked. However, before > all r1bio in the memory pool are released, the memory pool should not be > modified. Introduce freeze_array_totally() to solve the problem. Compared > to freeze_array(), it's more strict because any in-flight io needs to > complete including queued io. > > Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com> > --- > drivers/md/raid1.c | 35 +++++++++++++++++++++++++++++++++-- > 1 file changed, 33 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c > index dd25832eb045..5605c9680818 100644 > --- a/drivers/md/raid1.c > +++ b/drivers/md/raid1.c > @@ -1072,7 +1072,7 @@ static void freeze_array(struct r1conf *conf, int extra) > /* Stop sync I/O and normal I/O and wait for everything to > * go quiet. > * This is called in two situations: > - * 1) management command handlers (reshape, remove disk, quiesce). > + * 1) management command handlers (remove disk, quiesce). > * 2) one normal I/O request failed. > > * After array_frozen is set to 1, new sync IO will be blocked at > @@ -1111,6 +1111,37 @@ static void unfreeze_array(struct r1conf *conf) > wake_up(&conf->wait_barrier); > } > > +/* conf->resync_lock should be held */ > +static int get_pending(struct r1conf *conf) > +{ > + int idx, ret; > + > + ret = atomic_read(&conf->nr_sync_pending); > + for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) > + ret += atomic_read(&conf->nr_pending[idx]); > + > + return ret; > +} > + > +static void freeze_array_totally(struct r1conf *conf) > +{ > + /* > + * freeze_array_totally() is almost the same with freeze_array() except > + * it requires there's no queued io. Raid1's reshape will destroy the > + * old mempool and change r1conf::raid_disks, which are necessary when > + * freeing the queued io. > + */ > + spin_lock_irq(&conf->resync_lock); > + conf->array_frozen = 1; > + raid1_log(conf->mddev, "freeze totally"); > + wait_event_lock_irq_cmd( > + conf->wait_barrier, > + get_pending(conf) == 0, > + conf->resync_lock, > + md_wakeup_thread(conf->mddev->thread)); > + spin_unlock_irq(&conf->resync_lock); > +} > + > static void alloc_behind_master_bio(struct r1bio *r1_bio, > struct bio *bio) > { > @@ -3296,7 +3327,7 @@ static int raid1_reshape(struct mddev *mddev) > return -ENOMEM; > } > > - freeze_array(conf, 0); > + freeze_array_totally(conf); I think this is wrong, raid1_reshape() can't be called with 'reconfig_mutex' grabbed, and this will deadlock because failed io need this lock to be handled by daemon thread.(see details in [1]). Be aware that never hold 'reconfig_mutex' to wait for io. [1] https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c4fe7edfc73f750574ef0ec3eee8c2de95324463 > > /* ok, everything is stopped */ > oldpool = conf->r1bio_pool; >
Hi, 在 2023/07/20 9:36, Yu Kuai 写道: > Hi, > > 在 2023/07/19 15:09, Xueshi Hu 写道: >> When an IO error happens, reschedule_retry() will increase >> r1conf::nr_queued, which makes freeze_array() unblocked. However, before >> all r1bio in the memory pool are released, the memory pool should not be >> modified. Introduce freeze_array_totally() to solve the problem. Compared >> to freeze_array(), it's more strict because any in-flight io needs to >> complete including queued io. >> >> Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com> >> --- >> drivers/md/raid1.c | 35 +++++++++++++++++++++++++++++++++-- >> 1 file changed, 33 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c >> index dd25832eb045..5605c9680818 100644 >> --- a/drivers/md/raid1.c >> +++ b/drivers/md/raid1.c >> @@ -1072,7 +1072,7 @@ static void freeze_array(struct r1conf *conf, >> int extra) >> /* Stop sync I/O and normal I/O and wait for everything to >> * go quiet. >> * This is called in two situations: >> - * 1) management command handlers (reshape, remove disk, quiesce). >> + * 1) management command handlers (remove disk, quiesce). >> * 2) one normal I/O request failed. >> * After array_frozen is set to 1, new sync IO will be blocked at >> @@ -1111,6 +1111,37 @@ static void unfreeze_array(struct r1conf *conf) >> wake_up(&conf->wait_barrier); >> } >> +/* conf->resync_lock should be held */ >> +static int get_pending(struct r1conf *conf) >> +{ >> + int idx, ret; >> + >> + ret = atomic_read(&conf->nr_sync_pending); >> + for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) >> + ret += atomic_read(&conf->nr_pending[idx]); >> + >> + return ret; >> +} >> + >> +static void freeze_array_totally(struct r1conf *conf) >> +{ >> + /* >> + * freeze_array_totally() is almost the same with freeze_array() >> except >> + * it requires there's no queued io. Raid1's reshape will destroy >> the >> + * old mempool and change r1conf::raid_disks, which are necessary >> when >> + * freeing the queued io. >> + */ >> + spin_lock_irq(&conf->resync_lock); >> + conf->array_frozen = 1; >> + raid1_log(conf->mddev, "freeze totally"); >> + wait_event_lock_irq_cmd( >> + conf->wait_barrier, >> + get_pending(conf) == 0, >> + conf->resync_lock, >> + md_wakeup_thread(conf->mddev->thread)); >> + spin_unlock_irq(&conf->resync_lock); >> +} >> + >> static void alloc_behind_master_bio(struct r1bio *r1_bio, >> struct bio *bio) >> { >> @@ -3296,7 +3327,7 @@ static int raid1_reshape(struct mddev *mddev) >> return -ENOMEM; >> } >> - freeze_array(conf, 0); >> + freeze_array_totally(conf); > > I think this is wrong, raid1_reshape() can't be called with Sorry about thi typo, I mean raid1_reshape() can be called with ... Thanks, Kuai > 'reconfig_mutex' grabbed, and this will deadlock because failed io need > this lock to be handled by daemon thread.(see details in [1]). > > Be aware that never hold 'reconfig_mutex' to wait for io. > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c4fe7edfc73f750574ef0ec3eee8c2de95324463 > >> /* ok, everything is stopped */ >> oldpool = conf->r1bio_pool; >> > > . >
On Thu, Jul 20, 2023 at 09:37:38AM +0800, Yu Kuai wrote: > Hi, > > 在 2023/07/20 9:36, Yu Kuai 写道: > > Hi, > > > > 在 2023/07/19 15:09, Xueshi Hu 写道: > > > When an IO error happens, reschedule_retry() will increase > > > r1conf::nr_queued, which makes freeze_array() unblocked. However, before > > > all r1bio in the memory pool are released, the memory pool should not be > > > modified. Introduce freeze_array_totally() to solve the problem. Compared > > > to freeze_array(), it's more strict because any in-flight io needs to > > > complete including queued io. > > > > > > Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com> > > > --- > > > drivers/md/raid1.c | 35 +++++++++++++++++++++++++++++++++-- > > > 1 file changed, 33 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c > > > index dd25832eb045..5605c9680818 100644 > > > --- a/drivers/md/raid1.c > > > +++ b/drivers/md/raid1.c > > > @@ -1072,7 +1072,7 @@ static void freeze_array(struct r1conf *conf, > > > int extra) > > > /* Stop sync I/O and normal I/O and wait for everything to > > > * go quiet. > > > * This is called in two situations: > > > - * 1) management command handlers (reshape, remove disk, quiesce). > > > + * 1) management command handlers (remove disk, quiesce). > > > * 2) one normal I/O request failed. > > > * After array_frozen is set to 1, new sync IO will be blocked at > > > @@ -1111,6 +1111,37 @@ static void unfreeze_array(struct r1conf *conf) > > > wake_up(&conf->wait_barrier); > > > } > > > +/* conf->resync_lock should be held */ > > > +static int get_pending(struct r1conf *conf) > > > +{ > > > + int idx, ret; > > > + > > > + ret = atomic_read(&conf->nr_sync_pending); > > > + for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) > > > + ret += atomic_read(&conf->nr_pending[idx]); > > > + > > > + return ret; > > > +} > > > + > > > +static void freeze_array_totally(struct r1conf *conf) > > > +{ > > > + /* > > > + * freeze_array_totally() is almost the same with > > > freeze_array() except > > > + * it requires there's no queued io. Raid1's reshape will > > > destroy the > > > + * old mempool and change r1conf::raid_disks, which are > > > necessary when > > > + * freeing the queued io. > > > + */ > > > + spin_lock_irq(&conf->resync_lock); > > > + conf->array_frozen = 1; > > > + raid1_log(conf->mddev, "freeze totally"); > > > + wait_event_lock_irq_cmd( > > > + conf->wait_barrier, > > > + get_pending(conf) == 0, > > > + conf->resync_lock, > > > + md_wakeup_thread(conf->mddev->thread)); > > > + spin_unlock_irq(&conf->resync_lock); > > > +} > > > + > > > static void alloc_behind_master_bio(struct r1bio *r1_bio, > > > struct bio *bio) > > > { > > > @@ -3296,7 +3327,7 @@ static int raid1_reshape(struct mddev *mddev) > > > return -ENOMEM; > > > } > > > - freeze_array(conf, 0); > > > + freeze_array_totally(conf); > > > > I think this is wrong, raid1_reshape() can't be called with > Sorry about thi typo, I mean raid1_reshape() can be called with ... You're right, this is indeed a deadlock. I am wondering whether this approach is viable if (unlikely(atomic_read(conf->nr_queued))) { kfree(newpoolinfo); mempool_exit(&newpool); unfreeze_array(conf); set_bit(MD_RECOVERY_RECOVER, &mddev->recovery); set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); md_wakeup_thread(mddev->thread); return -EBUSY; } Thanks, Hu > > Thanks, > Kuai > > 'reconfig_mutex' grabbed, and this will deadlock because failed io need > > this lock to be handled by daemon thread.(see details in [1]). > > > > Be aware that never hold 'reconfig_mutex' to wait for io. > > > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c4fe7edfc73f750574ef0ec3eee8c2de95324463 > > > > > /* ok, everything is stopped */ > > > oldpool = conf->r1bio_pool; > > > > > > > . > > >
Hi, 在 2023/07/31 22:02, Xueshi Hu 写道: > On Thu, Jul 20, 2023 at 09:37:38AM +0800, Yu Kuai wrote: >> Hi, >> >> 在 2023/07/20 9:36, Yu Kuai 写道: >>> Hi, >>> >>> 在 2023/07/19 15:09, Xueshi Hu 写道: >>>> When an IO error happens, reschedule_retry() will increase >>>> r1conf::nr_queued, which makes freeze_array() unblocked. However, before >>>> all r1bio in the memory pool are released, the memory pool should not be >>>> modified. Introduce freeze_array_totally() to solve the problem. Compared >>>> to freeze_array(), it's more strict because any in-flight io needs to >>>> complete including queued io. >>>> >>>> Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com> >>>> --- >>>> drivers/md/raid1.c | 35 +++++++++++++++++++++++++++++++++-- >>>> 1 file changed, 33 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c >>>> index dd25832eb045..5605c9680818 100644 >>>> --- a/drivers/md/raid1.c >>>> +++ b/drivers/md/raid1.c >>>> @@ -1072,7 +1072,7 @@ static void freeze_array(struct r1conf *conf, >>>> int extra) >>>> /* Stop sync I/O and normal I/O and wait for everything to >>>> * go quiet. >>>> * This is called in two situations: >>>> - * 1) management command handlers (reshape, remove disk, quiesce). >>>> + * 1) management command handlers (remove disk, quiesce). >>>> * 2) one normal I/O request failed. >>>> * After array_frozen is set to 1, new sync IO will be blocked at >>>> @@ -1111,6 +1111,37 @@ static void unfreeze_array(struct r1conf *conf) >>>> wake_up(&conf->wait_barrier); >>>> } >>>> +/* conf->resync_lock should be held */ >>>> +static int get_pending(struct r1conf *conf) >>>> +{ >>>> + int idx, ret; >>>> + >>>> + ret = atomic_read(&conf->nr_sync_pending); >>>> + for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) >>>> + ret += atomic_read(&conf->nr_pending[idx]); >>>> + >>>> + return ret; >>>> +} >>>> + >>>> +static void freeze_array_totally(struct r1conf *conf) >>>> +{ >>>> + /* >>>> + * freeze_array_totally() is almost the same with >>>> freeze_array() except >>>> + * it requires there's no queued io. Raid1's reshape will >>>> destroy the >>>> + * old mempool and change r1conf::raid_disks, which are >>>> necessary when >>>> + * freeing the queued io. >>>> + */ >>>> + spin_lock_irq(&conf->resync_lock); >>>> + conf->array_frozen = 1; >>>> + raid1_log(conf->mddev, "freeze totally"); >>>> + wait_event_lock_irq_cmd( >>>> + conf->wait_barrier, >>>> + get_pending(conf) == 0, >>>> + conf->resync_lock, >>>> + md_wakeup_thread(conf->mddev->thread)); >>>> + spin_unlock_irq(&conf->resync_lock); >>>> +} >>>> + >>>> static void alloc_behind_master_bio(struct r1bio *r1_bio, >>>> struct bio *bio) >>>> { >>>> @@ -3296,7 +3327,7 @@ static int raid1_reshape(struct mddev *mddev) >>>> return -ENOMEM; >>>> } >>>> - freeze_array(conf, 0); >>>> + freeze_array_totally(conf); >>> >>> I think this is wrong, raid1_reshape() can't be called with >> Sorry about thi typo, I mean raid1_reshape() can be called with ... > You're right, this is indeed a deadlock. > > I am wondering whether this approach is viable > > if (unlikely(atomic_read(conf->nr_queued))) { > kfree(newpoolinfo); > mempool_exit(&newpool); > unfreeze_array(conf); > > set_bit(MD_RECOVERY_RECOVER, &mddev->recovery); > set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); > md_wakeup_thread(mddev->thread); > return -EBUSY; > } This is not okay, 'nr_queued' can be incresed at any time when normal io failed, read it once doesn't mean anything, and you need to freeze_array() before reading it: freeze_array // guarantee new io won't be dispatched if (atomic_read(conf->nr_queued)) ... unfreeze_array return -EBUSY; Fortunately, I'm working on another patchset to synchronize io with array configuration, which means all the callers of raid1_reshape() will suspend the array, and no normal io will be in progress, hence this problem won't exist as well. Thanks, Kuai > > Thanks, > Hu > >> >> Thanks, >> Kuai >>> 'reconfig_mutex' grabbed, and this will deadlock because failed io need >>> this lock to be handled by daemon thread.(see details in [1]). >>> >>> Be aware that never hold 'reconfig_mutex' to wait for io. >>> >>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c4fe7edfc73f750574ef0ec3eee8c2de95324463 >>> >>>> /* ok, everything is stopped */ >>>> oldpool = conf->r1bio_pool; >>>> >>> >>> . >>> >> > . >
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index dd25832eb045..5605c9680818 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -1072,7 +1072,7 @@ static void freeze_array(struct r1conf *conf, int extra) /* Stop sync I/O and normal I/O and wait for everything to * go quiet. * This is called in two situations: - * 1) management command handlers (reshape, remove disk, quiesce). + * 1) management command handlers (remove disk, quiesce). * 2) one normal I/O request failed. * After array_frozen is set to 1, new sync IO will be blocked at @@ -1111,6 +1111,37 @@ static void unfreeze_array(struct r1conf *conf) wake_up(&conf->wait_barrier); } +/* conf->resync_lock should be held */ +static int get_pending(struct r1conf *conf) +{ + int idx, ret; + + ret = atomic_read(&conf->nr_sync_pending); + for (idx = 0; idx < BARRIER_BUCKETS_NR; idx++) + ret += atomic_read(&conf->nr_pending[idx]); + + return ret; +} + +static void freeze_array_totally(struct r1conf *conf) +{ + /* + * freeze_array_totally() is almost the same with freeze_array() except + * it requires there's no queued io. Raid1's reshape will destroy the + * old mempool and change r1conf::raid_disks, which are necessary when + * freeing the queued io. + */ + spin_lock_irq(&conf->resync_lock); + conf->array_frozen = 1; + raid1_log(conf->mddev, "freeze totally"); + wait_event_lock_irq_cmd( + conf->wait_barrier, + get_pending(conf) == 0, + conf->resync_lock, + md_wakeup_thread(conf->mddev->thread)); + spin_unlock_irq(&conf->resync_lock); +} + static void alloc_behind_master_bio(struct r1bio *r1_bio, struct bio *bio) { @@ -3296,7 +3327,7 @@ static int raid1_reshape(struct mddev *mddev) return -ENOMEM; } - freeze_array(conf, 0); + freeze_array_totally(conf); /* ok, everything is stopped */ oldpool = conf->r1bio_pool;
When an IO error happens, reschedule_retry() will increase r1conf::nr_queued, which makes freeze_array() unblocked. However, before all r1bio in the memory pool are released, the memory pool should not be modified. Introduce freeze_array_totally() to solve the problem. Compared to freeze_array(), it's more strict because any in-flight io needs to complete including queued io. Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com> --- drivers/md/raid1.c | 35 +++++++++++++++++++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-)