Message ID | 20211213040907.2669480-1-qiulaibin@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | [-next] block/wbt: fix negative inflight counter when remove scsi device |
On Mon, Dec 13, 2021 at 12:09:07PM +0800, Laibin Qiu wrote:
> Submit_bio
> scsi_remove_device
> sd_remove
> del_gendisk
> blk_unregister_queue
> elv_unregister_queue
> wbt_enable_default
> <= Set rwb->enable_state (ON)
> q_qos_track
> <= rwb->enable_state (ON)
> ^^^^^^ this request will mark WBT_TRACKED without inflight add and will
> lead to drop rqw->inflight to -1 in wbt_done() which will trigger IO hung.
>
> Fix this by judge whether QUEUE_FLAG_REGISTERED is marked to distinguish
> scsi remove scene.
> Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
> Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
> ---
> block/blk-wbt.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> index 3ed71b8da887..537f77bb1365 100644
> --- a/block/blk-wbt.c
> +++ b/block/blk-wbt.c
> @@ -637,6 +637,10 @@ void wbt_enable_default(struct request_queue *q)
>  {
>  	struct rq_qos *rqos = wbt_rq_qos(q);
>
> +	/* Queue not registered? Maybe shutting down... */
> +	if (!blk_queue_registered(q))
> +		return;

Wouldn't it make more sense to simply not call wbt_enable_default from
elv_unregister_queue?
On Mon, Dec 13, 2021 at 09:16:51AM -0800, Christoph Hellwig wrote:
> On Mon, Dec 13, 2021 at 12:09:07PM +0800, Laibin Qiu wrote:
> > Submit_bio
> > scsi_remove_device
> > sd_remove
> > del_gendisk
> > blk_unregister_queue
> > elv_unregister_queue
> > wbt_enable_default
> > <= Set rwb->enable_state (ON)
> > q_qos_track
> > <= rwb->enable_state (ON)
> > ^^^^^^ this request will mark WBT_TRACKED without inflight add and will
> > lead to drop rqw->inflight to -1 in wbt_done() which will trigger IO hung.
> >
> > Fix this by judge whether QUEUE_FLAG_REGISTERED is marked to distinguish
> > scsi remove scene.
> > Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
> > Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
> > ---
> > block/blk-wbt.c | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> > index 3ed71b8da887..537f77bb1365 100644
> > --- a/block/blk-wbt.c
> > +++ b/block/blk-wbt.c
> > @@ -637,6 +637,10 @@ void wbt_enable_default(struct request_queue *q)
> >  {
> >  	struct rq_qos *rqos = wbt_rq_qos(q);
> >
> > +	/* Queue not registered? Maybe shutting down... */
> > +	if (!blk_queue_registered(q))
> > +		return;
>
> Wouldn't it make more sense to simply not call wbt_enable_default from
> elv_unregister_queue?

wbt_disable_default() is called in bfq_init_root_group(), so wbt_enable_default
should be moved to bfq_exit_queue()?

Thanks,
Ming
On 2021/12/14 1:16, Christoph Hellwig wrote:
> On Mon, Dec 13, 2021 at 12:09:07PM +0800, Laibin Qiu wrote:
>> Submit_bio
>> scsi_remove_device
>> sd_remove
>> del_gendisk
>> blk_unregister_queue
>> elv_unregister_queue
>> wbt_enable_default
>> <= Set rwb->enable_state (ON)
>> q_qos_track
>> <= rwb->enable_state (ON)
>> ^^^^^^ this request will mark WBT_TRACKED without inflight add and will
>> lead to drop rqw->inflight to -1 in wbt_done() which will trigger IO hung.
>>
>> Fix this by judge whether QUEUE_FLAG_REGISTERED is marked to distinguish
>> scsi remove scene.
>> Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
>> Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
>> ---
>> block/blk-wbt.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
>> index 3ed71b8da887..537f77bb1365 100644
>> --- a/block/blk-wbt.c
>> +++ b/block/blk-wbt.c
>> @@ -637,6 +637,10 @@ void wbt_enable_default(struct request_queue *q)
>>  {
>>  	struct rq_qos *rqos = wbt_rq_qos(q);
>>
>> +	/* Queue not registered? Maybe shutting down... */
>> +	if (!blk_queue_registered(q))
>> +		return;
>
> Wouldn't it make more sense to simply not call wbt_enable_default from
> elv_unregister_queue?
> .
>

Following your suggestion, I will post a V2. Please take another look.
On Tue, Dec 14, 2021 at 09:13:10AM +0800, Ming Lei wrote:
> > Wouldn't it make more sense to simply not call wbt_enable_default from
> > elv_unregister_queue?
>
> wbt_disable_default() is called in bfq_init_root_group(), so wbt_enable_default

s/bfq_init_root_group/bfq_init_queue/

But yes, that sounds like an even better idea.  Or maybe even an
elevator feature flag.
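For readers following the alternative discussed above, here is a minimal user-space sketch of the pairing idea: the scheduler's init hook turns throttling off, its exit hook turns it back on, and queue unregistration leaves the state alone. Everything here (struct toy_queue, toy_sched_init, toy_sched_exit, toy_unregister) is a hypothetical stand-in invented for illustration. It is not the kernel code and not necessarily what a V2 would look like.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a queue: only the two facts this sketch needs. */
struct toy_queue {
	bool wbt_on;          /* stands in for the wbt enable state */
	bool sched_attached;  /* true while the toy scheduler is attached */
};

/* Toy init hook: attaching the scheduler switches throttling off. */
static void toy_sched_init(struct toy_queue *q)
{
	q->sched_attached = true;
	q->wbt_on = false;
}

/* Toy exit hook: the matching place to switch throttling back on. */
static void toy_sched_exit(struct toy_queue *q)
{
	assert(q->sched_attached);  /* exit only makes sense after init */
	q->sched_attached = false;
	q->wbt_on = true;
}

/* Toy unregister step: deliberately does not touch the throttling state. */
static void toy_unregister(struct toy_queue *q)
{
	(void)q;
}
```

With this pairing, calling toy_sched_init() and then toy_unregister() leaves wbt_on false, so a request that was throttled while the state was off would also be tracked while it is off. Only toy_sched_exit() flips the state back.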
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 3ed71b8da887..537f77bb1365 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -637,6 +637,10 @@ void wbt_enable_default(struct request_queue *q)
 {
 	struct rq_qos *rqos = wbt_rq_qos(q);
 
+	/* Queue not registered? Maybe shutting down... */
+	if (!blk_queue_registered(q))
+		return;
+
 	/* Throttling already enabled? */
 	if (rqos) {
 		if (RQWB(rqos)->enable_state == WBT_STATE_OFF_DEFAULT)
@@ -644,10 +648,6 @@ void wbt_enable_default(struct request_queue *q)
 		return;
 	}
 
-	/* Queue not registered? Maybe shutting down... */
-	if (!blk_queue_registered(q))
-		return;
-
 	if (queue_is_mq(q) && IS_ENABLED(CONFIG_BLK_WBT_MQ))
 		wbt_init(q);
 }
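The effect of moving the check can be sketched in user space. The model below is an illustration only: struct toy_queue, the TOY_WBT_* states and both functions are hypothetical stand-ins, and the assignment to the ON state inside the rqos branch is assumed from the commit message ("Set rwb->enable_state (ON)"), since that line falls between the two hunks. It also assumes the registered flag is already cleared by the time the elevator is unregistered, which is what the patch relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy versions of the two enable states that matter here. */
enum toy_wbt_state { TOY_WBT_ON_DEFAULT, TOY_WBT_OFF_DEFAULT };

/* Hypothetical miniature of a request queue. */
struct toy_queue {
	bool registered;           /* models QUEUE_FLAG_REGISTERED */
	bool has_rqos;             /* models wbt_rq_qos(q) != NULL */
	enum toy_wbt_state state;  /* models RQWB(rqos)->enable_state */
};

/* Ordering before the patch: the rqos branch runs before the registered check. */
static void toy_enable_default_old(struct toy_queue *q)
{
	if (q->has_rqos) {
		if (q->state == TOY_WBT_OFF_DEFAULT)
			q->state = TOY_WBT_ON_DEFAULT;
		return;
	}
	if (!q->registered)
		return;
	/* wbt_init() would run here; not modelled. */
}

/* Ordering after the patch: an unregistered queue returns before any change. */
static void toy_enable_default_new(struct toy_queue *q)
{
	if (!q->registered)
		return;
	if (q->has_rqos) {
		if (q->state == TOY_WBT_OFF_DEFAULT)
			q->state = TOY_WBT_ON_DEFAULT;
		return;
	}
	/* wbt_init() would run here; not modelled. */
}

/* State left behind when enable_default runs on a queue that is going away. */
static enum toy_wbt_state toy_state_after_unregister(bool patched)
{
	struct toy_queue q = {
		.registered = false,  /* flag already cleared during teardown */
		.has_rqos = true,
		.state = TOY_WBT_OFF_DEFAULT,
	};

	assert(q.has_rqos);  /* the scenario needs an existing rqos */
	if (patched)
		toy_enable_default_new(&q);
	else
		toy_enable_default_old(&q);
	return q.state;
}
```

In this model toy_state_after_unregister(false) ends in the ON state, while toy_state_after_unregister(true) stays OFF, which is the behaviour change the diff is after.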
wbt is now disabled by setting WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when the elevator is switched to bfq. When a
SCSI device is removed, wbt is enabled again by wbt_enable_default().
If the state changes between wbt_wait() and wbt_track() while a write
request is being submitted, the request is tracked without having been
counted as inflight.

The following is the scenario that triggered the problem.

T1                          T2                          T3
                            elevator_switch_mq
                            bfq_init_queue
                            wbt_disable_default
                            <= Set rwb->enable_state (OFF)
Submit_bio
blk_mq_make_request
rq_qos_throttle
<= rwb->enable_state (OFF)
                                                        scsi_remove_device
                                                        sd_remove
                                                        del_gendisk
                                                        blk_unregister_queue
                                                        elv_unregister_queue
                                                        wbt_enable_default
                                                        <= Set rwb->enable_state (ON)
rq_qos_track
<= rwb->enable_state (ON)
^^^^^^ this request is marked WBT_TRACKED without an inflight
       increment, so wbt_done() drops rqw->inflight to -1, which
       causes an I/O hang.

Fix this by checking whether QUEUE_FLAG_REGISTERED is set, which
distinguishes the SCSI device removal case.

Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
---
 block/blk-wbt.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
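The accounting problem described in the scenario can be replayed deterministically in user space. The sketch below is only a model under stated assumptions: struct toy_rwb, struct toy_rq and the toy_* functions are hypothetical stand-ins for the wait, track and done steps, reduced to a boolean enable state and an integer counter. It does not use threads; the interleaving is simply written out in order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the throttling state. */
struct toy_rwb {
	bool enabled;   /* models rwb->enable_state being ON */
	int inflight;   /* models rqw->inflight */
};

/* Hypothetical miniature of a request. */
struct toy_rq {
	bool tracked;   /* models the WBT_TRACKED mark */
};

/* Throttle step: the request is counted only if throttling is enabled. */
static void toy_wait(struct toy_rwb *rwb)
{
	if (rwb->enabled)
		rwb->inflight++;
}

/* Track step: the request is marked only if throttling is enabled. */
static void toy_track(const struct toy_rwb *rwb, struct toy_rq *rq)
{
	rq->tracked = rwb->enabled;
}

/* Completion step: every tracked request gives back one inflight slot. */
static void toy_done(struct toy_rwb *rwb, const struct toy_rq *rq)
{
	if (rq->tracked)
		rwb->inflight--;
}

/* Replay one write; optionally flip the state between wait and track. */
static int toy_inflight_after_one_write(bool enable_races_in)
{
	struct toy_rwb rwb = { .enabled = false, .inflight = 0 };
	struct toy_rq rq = { .tracked = false };

	toy_wait(&rwb);              /* state is OFF: nothing is counted */
	assert(rwb.inflight == 0);
	if (enable_races_in)
		rwb.enabled = true;  /* the enable_default call from the removal path */
	toy_track(&rwb, &rq);        /* state is ON in the racy case: marked tracked */
	toy_done(&rwb, &rq);         /* decrement without a matching increment */
	return rwb.inflight;
}
```

In this model toy_inflight_after_one_write(true) returns -1 and toy_inflight_after_one_write(false) returns 0, matching the negative counter described above.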