Message ID | 20240903081653.65613-2-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | Fix some starvation problems in block layer |
On 9/3/24 2:16 AM, Muchun Song wrote:
> Supposing the following scenario with a virtio_blk driver.
>
> CPU0                            CPU1                                CPU2
>
> blk_mq_try_issue_directly()
>   __blk_mq_issue_directly()
>     q->mq_ops->queue_rq()
>       virtio_queue_rq()
>         blk_mq_stop_hw_queue()
>                                 blk_mq_try_issue_directly()         virtblk_done()
>                                   if (blk_mq_hctx_stopped())
>                                     blk_mq_request_bypass_insert()    blk_mq_start_stopped_hw_queue()
>                                     blk_mq_run_hw_queue()               blk_mq_run_hw_queue()
>                                     blk_mq_insert_request()
>                                     return // Who is responsible for dispatching this IO request?
>
> After CPU0 has marked the queue as stopped, CPU1 will see the queue is stopped.
> But before CPU1 puts the request on the dispatch list, CPU2 receives the interrupt
> of completion of request, so it will run the hardware queue and marks the queue
> as non-stopped. Meanwhile, CPU1 also runs the same hardware queue. After both CPU1
> and CPU2 complete blk_mq_run_hw_queue(), CPU1 just puts the request to the same
> hardware queue and returns. It misses dispatching a request. Fix it by running
> the hardware queue explicitly. And blk_mq_request_issue_directly() should handle
> a similar situation. Fix it as well.

Patch looks fine, but this commit message is waaaaay too wide. Please
limit it to 72-74 chars. The above ordering diagram is otherwise going
to be unreadable when viewing git log in a terminal.

-- 
Jens Axboe
> On Sep 10, 2024, at 21:17, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 9/3/24 2:16 AM, Muchun Song wrote:
>> Supposing the following scenario with a virtio_blk driver.
>> [...]
>
> Patch looks fine, but this commit message is waaaaay too wide. Please
> limit it to 72-74 chars. The above ordering diagram is otherwise going
> to be unreadable when viewing git log in a terminal.

Thanks for your reply. I'll adjust those lines to make the diagram more
readable.

Muchun,
Thanks.

>
> --
> Jens Axboe
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3c3c0c21b553..b2d0f22de0c7f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2619,6 +2619,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
 		blk_mq_insert_request(rq, 0);
+		blk_mq_run_hw_queue(hctx, false);
 		return;
 	}
 
@@ -2649,6 +2650,7 @@ static blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
 
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
 		blk_mq_insert_request(rq, 0);
+		blk_mq_run_hw_queue(hctx, false);
 		return BLK_STS_OK;
 	}
 
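The interleaving from the commit message can be replayed in a small user-space model to see why the extra queue run matters. This is only a sketch under loud assumptions: it is single-threaded, it replays one fixed ordering of the three CPUs, and `struct toy_hctx`, `toy_run_hw_queue()`, `toy_start_stopped_hw_queue()` and `toy_replay()` are hypothetical names invented for illustration. None of it is kernel code, and the real blk-mq functions do considerably more than these stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a hardware queue; NOT the kernel's struct blk_mq_hw_ctx. */
struct toy_hctx {
	bool stopped;    /* models the state set by blk_mq_stop_hw_queue() */
	int pending;     /* requests parked on the queue, not yet dispatched */
	int dispatched;  /* requests handed to the (imaginary) driver */
};

/* Rough model of blk_mq_run_hw_queue(): a stopped queue dispatches nothing. */
static void toy_run_hw_queue(struct toy_hctx *h)
{
	if (h->stopped)
		return;
	h->dispatched += h->pending;
	h->pending = 0;
}

/* Rough model of blk_mq_start_stopped_hw_queue(): unstop, then run. */
static void toy_start_stopped_hw_queue(struct toy_hctx *h)
{
	if (!h->stopped)
		return;
	h->stopped = false;
	toy_run_hw_queue(h);
}

/*
 * Replays the ordering from the commit message and returns how many
 * requests are left stranded on the queue afterwards.
 */
int toy_replay(bool with_fix)
{
	struct toy_hctx h = { .stopped = false, .pending = 0, .dispatched = 0 };

	/* CPU0: the driver's queue_rq() stops the hardware queue. */
	h.stopped = true;

	/* CPU1: observes the stopped queue and chooses the insert path. */
	bool cpu1_saw_stopped = h.stopped;
	assert(cpu1_saw_stopped);

	/* CPU2: completion handler restarts the queue; nothing is pending yet. */
	toy_start_stopped_hw_queue(&h);

	/* CPU1: also runs the queue, still before its request is inserted. */
	toy_run_hw_queue(&h);

	/* CPU1: only now inserts the request. */
	h.pending++;

	/* The patch: run the queue once more after the insert. */
	if (with_fix)
		toy_run_hw_queue(&h);

	return h.pending;
}
```

In this model `toy_replay(false)` leaves one request pending with nobody left to run the queue, while `toy_replay(true)` drains it, which mirrors the one-line `blk_mq_run_hw_queue(hctx, false)` addition in both hunks of the diff.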