Message ID | 20220726110111.1557859-1-yuyufen@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] blk-mq: run queue no matter whether the request is the last request |
On Tue, Jul 26, 2022 at 07:01:11PM +0800, Yufen Yu wrote:
> We tested on a virtio scsi device (/dev/sda) with the default mq
> scheduler 'none', and found an IO hang as follows:
>
> blk_finish_plug
>   blk_mq_plug_issue_direct
>       scsi_mq_get_budget
>       //get budget_token fail and sdev->restarts=1
>
>                                  scsi_end_request
>                                    scsi_run_queue_async
>                                    //sdev->restarts=0 and run queue
>
>      blk_mq_request_bypass_insert
>         //add request to hctx->dispatch list
>
>   //continue to dispatch plug list
>   blk_mq_dispatch_plug_list
>       blk_mq_try_issue_list_directly
>         //success issue all requests from plug list
>
> After .get_budget fails, scsi_mq_get_budget increments 'restarts'.
> Normally, the hw queue is run when an IO completes, and 'restarts' is
> reset to 0. But if that queue run happens before the request is added
> to the dispatch list, and blk_mq_dispatch_plug_list then successfully
> issues all remaining requests, no one will run the queue again, and
> the request will stall in the dispatch list forever.

The story isn't actually related to scsi.

>
> It is wrong to use the last request of the plug list to decide whether
> a queue run is needed, since all the remaining requests in the plug
> list may be from other hctxs. To fix the bug, always pass run_queue as
> true to blk_mq_request_bypass_insert().
>
> Fix-suggested-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Yufen Yu <yuyufen@huawei.com>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 93d9d60980fb..1eb13d57a946 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2568,7 +2568,7 @@ static void blk_mq_plug_issue_direct(struct blk_plug *plug, bool from_schedule)
 			break;
 		case BLK_STS_RESOURCE:
 		case BLK_STS_DEV_RESOURCE:
-			blk_mq_request_bypass_insert(rq, false, last);
+			blk_mq_request_bypass_insert(rq, false, true);
 			blk_mq_commit_rqs(hctx, &queued, from_schedule);
 			return;
 		default:
We tested on a virtio scsi device (/dev/sda) with the default mq
scheduler 'none', and found an IO hang as follows:

blk_finish_plug
  blk_mq_plug_issue_direct
      scsi_mq_get_budget
      //get budget_token fail and sdev->restarts=1

                                 scsi_end_request
                                   scsi_run_queue_async
                                   //sdev->restarts=0 and run queue

     blk_mq_request_bypass_insert
        //add request to hctx->dispatch list

  //continue to dispatch plug list
  blk_mq_dispatch_plug_list
      blk_mq_try_issue_list_directly
        //success issue all requests from plug list

After .get_budget fails, scsi_mq_get_budget increments 'restarts'.
Normally, the hw queue is run when an IO completes, and 'restarts' is
reset to 0. But if that queue run happens before the request is added
to the dispatch list, and blk_mq_dispatch_plug_list then successfully
issues all remaining requests, no one will run the queue again, and
the request will stall in the dispatch list forever.

It is wrong to use the last request of the plug list to decide whether
a queue run is needed, since all the remaining requests in the plug
list may be from other hctxs. To fix the bug, always pass run_queue as
true to blk_mq_request_bypass_insert().

Fix-suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
---
 block/blk-mq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
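
For readers who want to step through the interleaving described above, here is a minimal userspace sketch of it. It is a toy model, not kernel code: struct toy_hctx and all toy_* functions are hypothetical names invented for illustration, the queue run is modelled as a synchronous drain rather than an async run, and the "remaining plug requests go to other hctxs" step is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these types or functions are kernel APIs. */
struct toy_hctx {
	int dispatch_len;	/* requests parked on the "dispatch list" */
	int restarts;		/* stands in for sdev->restarts */
	int completed;		/* requests that made it to the device */
};

/* Models a queue run: everything on the dispatch list gets issued. */
static void toy_run_queue(struct toy_hctx *h)
{
	h->completed += h->dispatch_len;
	h->dispatch_len = 0;
}

/* Models the completion path: if a restart is owed, clear it and run the queue. */
static void toy_complete_inflight(struct toy_hctx *h)
{
	if (h->restarts) {
		h->restarts = 0;
		toy_run_queue(h);
	}
}

/* Models blk_mq_request_bypass_insert(rq, false, run_queue). */
static void toy_bypass_insert(struct toy_hctx *h, bool run_queue)
{
	h->dispatch_len++;
	if (run_queue)
		toy_run_queue(h);
}

/*
 * Replays the interleaving from the commit message and returns the
 * number of requests left stranded on the dispatch list.
 */
int toy_replay(bool run_queue)
{
	struct toy_hctx h = { 0 };

	/* 1. get_budget fails for the request: a restart is now owed. */
	h.restarts = 1;

	/* 2. An in-flight IO completes first and runs the (still empty) queue. */
	toy_complete_inflight(&h);
	assert(h.restarts == 0 && h.dispatch_len == 0);

	/* 3. Only now is the failed request added to the dispatch list. */
	toy_bypass_insert(&h, run_queue);

	/*
	 * 4. The remaining plug requests belong to other hctxs and issue
	 *    successfully, so nothing else runs this queue.
	 */
	return h.dispatch_len;
}
```

In this model, toy_replay(false) corresponds to the old code when the failing request is not the last one in the plug list, and leaves one request stranded; toy_replay(true) corresponds to the patched call and leaves none.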