Message ID | 20211214044259.2656456-1-qiulaibin@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2,-next] block/wbt: fix negative inflight counter when remove scsi device |
On Tue, Dec 14, 2021 at 12:42:59PM +0800, Laibin Qiu wrote:
> Now that we disable wbt by set WBT_STATE_OFF_DEFAULT in
> wbt_disable_default() when switch elevator to bfq. And when
> we remove scsi device, wbt will be enabled by wbt_enable_default.
> If it become false positive between wbt_wait() and wbt_track()
> when submit write request.
>
> The following is the scenario that triggered the problem.
>
> T1                          T2                           T3
>                             elevator_switch_mq
>                             bfq_init_queue
>                             wbt_disable_default <= Set
>                             rwb->enable_state (OFF)
> Submit_bio
> blk_mq_make_request
> rq_qos_throttle
> <= rwb->enable_state (OFF)
>                                                          scsi_remove_device
>                                                          sd_remove
>                                                          del_gendisk
>                                                          blk_unregister_queue
>                                                          elv_unregister_queue
>                                                          wbt_enable_default
>                                                          <= Set rwb->enable_state (ON)
> q_qos_track
> <= rwb->enable_state (ON)
> ^^^^^^ this request will mark WBT_TRACKED without inflight add and will
> lead to drop rqw->inflight to -1 in wbt_done() which will trigger IO hung.
>
> Fix this by move wbt_enable_default() from elv_unregister to
> elevator_switch_mq. Only re-enable wbt when scheduler switch.
>
> Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
> Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
> ---
>  block/elevator.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/block/elevator.c b/block/elevator.c
> index ec98aed39c4f..de3cf1fa52fa 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -525,8 +525,6 @@ void elv_unregister_queue(struct request_queue *q)
>  		kobject_del(&e->kobj);
>
>  		e->registered = 0;
> -		/* Re-enable throttling in case elevator disabled it */
> -		wbt_enable_default(q);
>  	}
>  }
>
> @@ -593,8 +591,11 @@ int elevator_switch_mq(struct request_queue *q,
>  	lockdep_assert_held(&q->sysfs_lock);
>
>  	if (q->elevator) {
> -		if (q->elevator->registered)
> +		if (q->elevator->registered) {
>  			elv_unregister_queue(q);
> +			/* Re-enable throttling in case elevator disabled it */
> +			wbt_enable_default(q);
> +		}

Please move wbt_enable_default() into bfq_exit_queue(), which should be
easier to follow and fix the issue too given only bfq disables wbt.

Thanks,
Ming
diff --git a/block/elevator.c b/block/elevator.c
index ec98aed39c4f..de3cf1fa52fa 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -525,8 +525,6 @@ void elv_unregister_queue(struct request_queue *q)
 		kobject_del(&e->kobj);
 
 		e->registered = 0;
-		/* Re-enable throttling in case elevator disabled it */
-		wbt_enable_default(q);
 	}
 }
 
@@ -593,8 +591,11 @@ int elevator_switch_mq(struct request_queue *q,
 	lockdep_assert_held(&q->sysfs_lock);
 
 	if (q->elevator) {
-		if (q->elevator->registered)
+		if (q->elevator->registered) {
 			elv_unregister_queue(q);
+			/* Re-enable throttling in case elevator disabled it */
+			wbt_enable_default(q);
+		}
 
 		ioc_clear_queue(q);
 		blk_mq_sched_free_rqs(q);
wbt is now disabled by setting WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when the elevator is switched to bfq. When a
scsi device is removed, wbt is re-enabled by wbt_enable_default().
If rwb->enable_state flips from OFF to ON between wbt_wait() and
wbt_track() while a write request is being submitted, the request is
marked as tracked without having been counted as inflight.

The following scenario triggers the problem.

T1                          T2                           T3
                            elevator_switch_mq
                            bfq_init_queue
                            wbt_disable_default <= Set
                            rwb->enable_state (OFF)
submit_bio
blk_mq_make_request
rq_qos_throttle
<= rwb->enable_state (OFF)
                                                         scsi_remove_device
                                                         sd_remove
                                                         del_gendisk
                                                         blk_unregister_queue
                                                         elv_unregister_queue
                                                         wbt_enable_default
                                                         <= Set rwb->enable_state (ON)
rq_qos_track
<= rwb->enable_state (ON)
^^^^^^ this request is marked WBT_TRACKED without an inflight add, so
wbt_done() drops rqw->inflight to -1, which triggers an IO hang.

Fix this by moving wbt_enable_default() from elv_unregister_queue() to
elevator_switch_mq(), so that wbt is re-enabled only on a scheduler switch.

Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
---
 block/elevator.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)