Message ID | 20240811101921.4031-2-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | Fix some starvation problems |
On Sun, Aug 11, 2024 at 06:19:18PM +0800, Muchun Song wrote:
> Supposing the following scenario with a virtio_blk driver.
>
> CPU0                                CPU1                            CPU2
>
> blk_mq_try_issue_directly()
>   __blk_mq_issue_directly()
>     q->mq_ops->queue_rq()
>       virtio_queue_rq()
>         blk_mq_stop_hw_queue()
>                                     blk_mq_try_issue_directly()     virtblk_done()
>                                       if (blk_mq_hctx_stopped())
>   blk_mq_request_bypass_insert()                                      blk_mq_start_stopped_hw_queue()
>   blk_mq_run_hw_queue()                                               blk_mq_run_hw_queue()
>                                         blk_mq_insert_request()
>                                         return // Who is responsible for dispatching this IO request?
>
> After CPU0 has marked the queue as stopped, CPU1 will see the queue is stopped.
> But before CPU1 puts the request on the dispatch list, CPU2 receives the interrupt
> of completion of request, so it will run the hardware queue and marks the queue
> as non-stopped. Meanwhile, CPU1 also runs the same hardware queue. After both CPU1
> and CPU2 complete blk_mq_run_hw_queue(), CPU1 just puts the request to the same
> hardware queue and returns. Seems it misses dispatching a request. Fix it by
> running the hardware queue explicitly. I think blk_mq_request_issue_directly()
> should handle a similar problem.
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  block/blk-mq.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e3c3c0c21b553..b2d0f22de0c7f 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2619,6 +2619,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>
>  	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
>  		blk_mq_insert_request(rq, 0);
> +		blk_mq_run_hw_queue(hctx, false);
>  		return;
>  	}
>
> @@ -2649,6 +2650,7 @@ static blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
>
>  	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
>  		blk_mq_insert_request(rq, 0);
> +		blk_mq_run_hw_queue(hctx, false);
>  		return BLK_STS_OK;
>  	}

Looks one real issue, and the fix is fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3c3c0c21b553..b2d0f22de0c7f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2619,6 +2619,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
 		blk_mq_insert_request(rq, 0);
+		blk_mq_run_hw_queue(hctx, false);
 		return;
 	}
 
@@ -2649,6 +2650,7 @@ static blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
 
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
 		blk_mq_insert_request(rq, 0);
+		blk_mq_run_hw_queue(hctx, false);
 		return BLK_STS_OK;
 	}
Supposing the following scenario with a virtio_blk driver.

CPU0                                CPU1                            CPU2

blk_mq_try_issue_directly()
  __blk_mq_issue_directly()
    q->mq_ops->queue_rq()
      virtio_queue_rq()
        blk_mq_stop_hw_queue()
                                    blk_mq_try_issue_directly()     virtblk_done()
                                      if (blk_mq_hctx_stopped())
  blk_mq_request_bypass_insert()                                      blk_mq_start_stopped_hw_queue()
  blk_mq_run_hw_queue()                                               blk_mq_run_hw_queue()
                                        blk_mq_insert_request()
                                        return // Who is responsible for dispatching this IO request?

After CPU0 has marked the queue as stopped, CPU1 will see the queue is stopped.
But before CPU1 puts the request on the dispatch list, CPU2 receives the interrupt
of completion of request, so it will run the hardware queue and marks the queue
as non-stopped. Meanwhile, CPU1 also runs the same hardware queue. After both CPU1
and CPU2 complete blk_mq_run_hw_queue(), CPU1 just puts the request to the same
hardware queue and returns. Seems it misses dispatching a request. Fix it by
running the hardware queue explicitly. I think blk_mq_request_issue_directly()
should handle a similar problem.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 block/blk-mq.c | 2 ++
 1 file changed, 2 insertions(+)
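
For readers who want to step through the interleaving, here is a rough userspace sketch that replays it sequentially. It is not kernel code: `struct toy_hctx` and every `toy_*` function are hypothetical stand-ins invented for illustration, and the model assumes that running a stopped queue dispatches nothing and that a restarted queue drains its whole dispatch list in one run. The ordering of the CPU0 and CPU2 steps relative to each other is also an assumption; the scenario only requires that both finish before CPU1 inserts.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a hardware queue; NOT the kernel's struct blk_mq_hw_ctx. */
struct toy_hctx {
	bool stopped;    /* models the stopped state of the queue */
	int pending;     /* requests sitting on the dispatch list */
	int dispatched;  /* requests handed to the (imaginary) driver */
};

/* Loosely models blk_mq_run_hw_queue(): a stopped queue dispatches nothing. */
static void toy_run_hw_queue(struct toy_hctx *h)
{
	assert(h->pending >= 0);
	if (h->stopped)
		return;
	h->dispatched += h->pending;
	h->pending = 0;
}

/* Loosely models blk_mq_start_stopped_hw_queue(): clear the flag, then run. */
static void toy_start_stopped_hw_queue(struct toy_hctx *h)
{
	if (!h->stopped)
		return;
	h->stopped = false;
	toy_run_hw_queue(h);
}

/*
 * Replays the CPU0/CPU1/CPU2 interleaving one step at a time and returns
 * the number of requests left on the dispatch list with nobody scheduled
 * to run the queue again.
 */
int toy_replay(bool with_fix)
{
	struct toy_hctx h = { .stopped = false, .pending = 0, .dispatched = 0 };

	/* CPU0: the driver runs out of resources and stops the queue. */
	h.stopped = true;

	/* CPU1: sees the stopped queue and decides to insert, not issue. */
	bool cpu1_saw_stopped = h.stopped;

	/* CPU0: puts its own request on the dispatch list. */
	h.pending++;

	/* CPU2: completion interrupt restarts the queue and runs it. */
	toy_start_stopped_hw_queue(&h);

	/* CPU0: runs the queue as well; nothing is left to dispatch. */
	toy_run_hw_queue(&h);

	/* CPU1: only now inserts its request, after both runs are over. */
	if (cpu1_saw_stopped) {
		h.pending++;
		if (with_fix)
			toy_run_hw_queue(&h); /* the explicit run added by the patch */
	}

	return h.pending;
}
```

In this model `toy_replay(false)` returns 1, meaning CPU1's request is stranded on the list, while `toy_replay(true)` returns 0 because the extra run after the insert picks it up. The real code paths involve more state (quiesce, async runs, memory ordering) that this sketch deliberately leaves out.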