Message ID | 20200513095443.2038859-7-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | blk-mq: support batching dispatch from scheduler |
On Wed, May 13, 2020 at 05:54:40PM +0800, Ming Lei wrote:
> Move code for handling partial dispatch into one helper, so that
> blk_mq_dispatch_rq_list gets a bit simplified, and easier to read.
>
> No functional change.

The concept looks good, but some of the logic is very convoluted.
What do you think of something like this on top:

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 86beb8c668689..8c9a6a886919c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1236,24 +1236,11 @@ static void blk_mq_handle_partial_dispatch(struct blk_mq_hw_ctx *hctx,
 		blk_status_t ret, bool queued)
 {
 	struct request_queue *q = hctx->queue;
-	bool needs_restart;
-	bool no_tag = false;
 	bool no_budget_avail = false;
 
 	/*
-	 * For non-shared tags, the RESTART check
-	 * will suffice.
-	 */
-	if (prep == PREP_DISPATCH_NO_TAG &&
-	    (hctx->flags & BLK_MQ_F_TAG_SHARED))
-		no_tag = true;
-	if (prep == PREP_DISPATCH_NO_BUDGET)
-		no_budget_avail = true;
-
-	/*
-	 * If we didn't flush the entire list, we could have told
-	 * the driver there was more coming, but that turned out to
-	 * be a lie.
+	 * Commit the current batch. There are more waiting requests, but we
+	 * can't guarantee that we'll handle them ASAP.
 	 */
 	if (q->mq_ops->commit_rqs && queued)
 		q->mq_ops->commit_rqs(hctx);
@@ -1263,36 +1250,52 @@ static void blk_mq_handle_partial_dispatch(struct blk_mq_hw_ctx *hctx,
 	spin_unlock(&hctx->lock);
 
 	/*
-	 * If SCHED_RESTART was set by the caller of this function and
-	 * it is no longer set that means that it was cleared by another
-	 * thread and hence that a queue rerun is needed.
+	 * If SCHED_RESTART was set by the caller and it is no longer set, it
+	 * must have been cleared by another thread and hence a queue rerun is
+	 * needed.
 	 *
-	 * If 'no_tag' is set, that means that we failed getting
-	 * a driver tag with an I/O scheduler attached. If our dispatch
+	 * If blk_mq_prep_dispatch_rq returned PREP_DISPATCH_NO_TAG, we failed
+	 * to get a driver tag with an I/O scheduler attached. If our dispatch
 	 * waitqueue is no longer active, ensure that we run the queue
 	 * AFTER adding our entries back to the list.
+	 * If no I/O scheduler has been configured it is possible that the
+	 * hardware queue got stopped and restarted before requests were pushed
+	 * back onto the dispatch list. Rerun the queue to avoid starvation.
 	 *
-	 * If no I/O scheduler has been configured it is possible that
-	 * the hardware queue got stopped and restarted before requests
-	 * were pushed back onto the dispatch list. Rerun the queue to
-	 * avoid starvation. Notes:
-	 *	- blk_mq_run_hw_queue() checks whether or not a queue has
-	 *	  been stopped before rerunning a queue.
-	 *	- Some but not all block drivers stop a queue before
-	 *	  returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
-	 *	  and dm-rq.
+	 * Notes:
+	 *	- blk_mq_run_hw_queue() checks whether or not a queue has been
+	 *	  stopped before rerunning a queue.
+	 *	- Some but not all block drivers stop a queue before returning
+	 *	  BLK_STS_RESOURCE. Two exceptions are scsi-mq and dm-rq.
 	 *
-	 * If driver returns BLK_STS_RESOURCE and SCHED_RESTART
-	 * bit is set, run queue after a delay to avoid IO stalls
-	 * that could otherwise occur if the queue is idle. We'll do
-	 * similar if we couldn't get budget and SCHED_RESTART is set.
+	 * If driver returns BLK_STS_RESOURCE and the SCHED_RESTART bit is set,
+	 * run queue after a delay to avoid IO stalls that could otherwise occur
+	 * if the queue is idle. We'll do similar if we couldn't get budget and
+	 * SCHED_RESTART is set.
 	 */
-	needs_restart = blk_mq_sched_needs_restart(hctx);
-	if (!needs_restart ||
-	    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
+	switch (prep) {
+	case PREP_DISPATCH_NO_TAG:
+		if ((hctx->flags & BLK_MQ_F_TAG_SHARED) &&
+		    list_empty_careful(&hctx->dispatch_wait.entry)) {
+			blk_mq_run_hw_queue(hctx, true);
+			return;
+		}
+		/*
+		 * For non-shared tags, the RESTART check will suffice.
+		 */
+		break;
+	case PREP_DISPATCH_OK:
+		if (ret == BLK_STS_RESOURCE)
+			no_budget_avail = true;
+		break;
+	case PREP_DISPATCH_NO_BUDGET:
+		no_budget_avail = true;
+		break;
+	}
+
+	if (!blk_mq_sched_needs_restart(hctx))
 		blk_mq_run_hw_queue(hctx, true);
-	else if (needs_restart && (ret == BLK_STS_RESOURCE ||
-				   no_budget_avail))
+	else if (no_budget_avail)
 		blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY);
 }
 
@@ -1336,8 +1339,6 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 			 * accept.
 			 */
 			blk_mq_handle_zone_resource(rq, &zone_list);
-			if (list_empty(list))
-				break;
 			continue;
 		}
 
@@ -1350,9 +1351,6 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 		queued++;
 	} while (!list_empty(list));
 
-	if (!list_empty(&zone_list))
-		list_splice_tail_init(&zone_list, list);
-
 	hctx->dispatched[queued_to_index(queued)]++;
 
 	/*
@@ -1360,11 +1358,13 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 	 * that is where we will continue on next queue run.
 	 */
 	if (!list_empty(list)) {
-		blk_mq_handle_partial_dispatch(hctx, list, prep, ret, !!queued);
+		list_splice_tail_init(&zone_list, list);
+		blk_mq_handle_partial_dispatch(hctx, list, prep, ret, queued);
 		blk_mq_update_dispatch_busy(hctx, true);
 		return false;
-	} else
-		blk_mq_update_dispatch_busy(hctx, false);
+	}
+
+	blk_mq_update_dispatch_busy(hctx, false);
 
 	/*
 	 * If the host/device is unable to accept more work, inform the
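
To make it easier to compare the two versions of the rerun decision in this thread, here is a minimal user-space sketch that models both as pure functions over the same inputs. Everything in it is an assumption made for illustration: `struct partial_dispatch_state`, `enum rerun_action`, `decide_original()`, `decide_switch()` and `count_mismatches()` are hypothetical names, the booleans stand in for the kernel checks noted in the comments, and nothing here is kernel code.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the names used in the diff. */
enum prep_dispatch {
	PREP_DISPATCH_OK,
	PREP_DISPATCH_NO_TAG,
	PREP_DISPATCH_NO_BUDGET,
};

enum rerun_action { RERUN_NONE, RERUN_NOW, RERUN_DELAYED };

struct partial_dispatch_state {
	enum prep_dispatch prep;
	bool ret_is_resource;	/* models ret == BLK_STS_RESOURCE */
	bool tags_shared;	/* models hctx->flags & BLK_MQ_F_TAG_SHARED */
	bool wait_entry_empty;	/* models list_empty_careful(&hctx->dispatch_wait.entry) */
	bool needs_restart;	/* models blk_mq_sched_needs_restart(hctx) */
};

/* Decision shaped like the helper in the posted patch. */
static enum rerun_action decide_original(const struct partial_dispatch_state *s)
{
	bool no_tag = s->prep == PREP_DISPATCH_NO_TAG && s->tags_shared;
	bool no_budget_avail = s->prep == PREP_DISPATCH_NO_BUDGET;

	if (!s->needs_restart || (no_tag && s->wait_entry_empty))
		return RERUN_NOW;
	if (s->ret_is_resource || no_budget_avail)
		return RERUN_DELAYED;
	return RERUN_NONE;
}

/* Decision shaped like the switch-based version proposed on top. */
static enum rerun_action decide_switch(const struct partial_dispatch_state *s)
{
	bool no_budget_avail = false;

	switch (s->prep) {
	case PREP_DISPATCH_NO_TAG:
		if (s->tags_shared && s->wait_entry_empty)
			return RERUN_NOW;
		break;
	case PREP_DISPATCH_OK:
		if (s->ret_is_resource)
			no_budget_avail = true;
		break;
	case PREP_DISPATCH_NO_BUDGET:
		no_budget_avail = true;
		break;
	}

	if (!s->needs_restart)
		return RERUN_NOW;
	if (no_budget_avail)
		return RERUN_DELAYED;
	return RERUN_NONE;
}

/* Enumerate every combination of inputs and count where the two differ. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int prep = PREP_DISPATCH_OK; prep <= PREP_DISPATCH_NO_BUDGET; prep++) {
		for (int bits = 0; bits < 16; bits++) {
			struct partial_dispatch_state s = {
				.prep = (enum prep_dispatch)prep,
				.ret_is_resource = bits & 1,
				.tags_shared = bits & 2,
				.wait_entry_empty = bits & 4,
				.needs_restart = bits & 8,
			};

			if (decide_original(&s) != decide_switch(&s))
				mismatches++;
		}
	}
	return mismatches;
}
```

In this model the two functions agree for every input except when `prep` is `PREP_DISPATCH_NO_TAG` while `ret_is_resource` is true and the early-return condition does not hold. Whether that combination can actually be reached from the dispatch loop is not established by this sketch and would have to be checked against the real code.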
Btw, with the many arguments, and more being added later, I'm no longer sure this is really worth a separate function. A goto label at the end of blk_mq_dispatch_rq_list that can be jumped to, and which has the same amount of indentation, might be the better idea.
On Wed, May 13, 2020 at 06:01:00AM -0700, Christoph Hellwig wrote:
> Btw, with the many arguments and more being added later I'm not sure

blk_mq_handle_partial_dispatch() is always inlined, and each parameter
has a precise meaning, so IMO more parameters shouldn't be an issue.

> anymore if this is really worth a separate function or just if goto
> label at the end of blk_mq_dispatch_rq_list that can be jumped to,
> and which has the same amount of indentation is the better idea.
>

There are more benefits to doing it this way:

1) blk_mq_dispatch_rq_list() becomes more readable

2) the name blk_mq_handle_partial_dispatch() documents what the code does

3) it is easier to add new code to both blk_mq_dispatch_rq_list() and
blk_mq_handle_partial_dispatch(), and likewise to change either function

4) it is easier to verify/review two small functions

So IMO it is worth a separate function, especially as blk_mq_dispatch_rq_list()
is becoming a big monster function.

Thanks,
Ming
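
The two shapes being weighed in this exchange can be seen side by side in a toy, user-space sketch. Both functions below dispatch items until one is rejected and then record the leftovers, once through a named helper and once through a label at the end of the function. All names here (`struct toy_queue`, `try_dispatch()`, `requeue_leftovers()`, `dispatch_with_helper()`, `dispatch_with_label()`) are hypothetical and are not taken from blk-mq.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy queue: items[0..count) wait to be dispatched. */
struct toy_queue {
	int items[8];
	size_t count;
	size_t requeued;	/* how many leftovers were pushed back */
};

/* Pretend the device rejects negative items. */
static bool try_dispatch(int item)
{
	return item >= 0;
}

/* Variant 1: the partial-dispatch handling lives in a named helper. */
static void requeue_leftovers(struct toy_queue *q, size_t done)
{
	q->requeued = q->count - done;
}

static bool dispatch_with_helper(struct toy_queue *q)
{
	size_t done = 0;

	while (done < q->count && try_dispatch(q->items[done]))
		done++;

	if (done < q->count) {
		requeue_leftovers(q, done);
		return false;
	}
	return true;
}

/* Variant 2: the same handling behind a label at the end of the function. */
static bool dispatch_with_label(struct toy_queue *q)
{
	size_t done = 0;

	while (done < q->count) {
		if (!try_dispatch(q->items[done]))
			goto partial;
		done++;
	}
	return true;

partial:
	q->requeued = q->count - done;
	return false;
}
```

In the helper variant every piece of state the handling needs has to be passed as a parameter, which is the cost raised in the review. In the label variant the locals are shared but the outer function grows, which is the cost raised in the reply.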
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 34fd09adb7fc..86beb8c66868 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1231,6 +1231,71 @@ static blk_status_t blk_mq_dispatch_rq(struct request *rq, bool is_last)
 	return rq->q->mq_ops->queue_rq(rq->mq_hctx, &bd);
 }
 
+static void blk_mq_handle_partial_dispatch(struct blk_mq_hw_ctx *hctx,
+		struct list_head *list, enum prep_dispatch prep,
+		blk_status_t ret, bool queued)
+{
+	struct request_queue *q = hctx->queue;
+	bool needs_restart;
+	bool no_tag = false;
+	bool no_budget_avail = false;
+
+	/*
+	 * For non-shared tags, the RESTART check
+	 * will suffice.
+	 */
+	if (prep == PREP_DISPATCH_NO_TAG &&
+	    (hctx->flags & BLK_MQ_F_TAG_SHARED))
+		no_tag = true;
+	if (prep == PREP_DISPATCH_NO_BUDGET)
+		no_budget_avail = true;
+
+	/*
+	 * If we didn't flush the entire list, we could have told
+	 * the driver there was more coming, but that turned out to
+	 * be a lie.
+	 */
+	if (q->mq_ops->commit_rqs && queued)
+		q->mq_ops->commit_rqs(hctx);
+
+	spin_lock(&hctx->lock);
+	list_splice_tail_init(list, &hctx->dispatch);
+	spin_unlock(&hctx->lock);
+
+	/*
+	 * If SCHED_RESTART was set by the caller of this function and
+	 * it is no longer set that means that it was cleared by another
+	 * thread and hence that a queue rerun is needed.
+	 *
+	 * If 'no_tag' is set, that means that we failed getting
+	 * a driver tag with an I/O scheduler attached. If our dispatch
+	 * waitqueue is no longer active, ensure that we run the queue
+	 * AFTER adding our entries back to the list.
+	 *
+	 * If no I/O scheduler has been configured it is possible that
+	 * the hardware queue got stopped and restarted before requests
+	 * were pushed back onto the dispatch list. Rerun the queue to
+	 * avoid starvation. Notes:
+	 *	- blk_mq_run_hw_queue() checks whether or not a queue has
+	 *	  been stopped before rerunning a queue.
+	 *	- Some but not all block drivers stop a queue before
+	 *	  returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
+	 *	  and dm-rq.
+	 *
+	 * If driver returns BLK_STS_RESOURCE and SCHED_RESTART
+	 * bit is set, run queue after a delay to avoid IO stalls
+	 * that could otherwise occur if the queue is idle. We'll do
+	 * similar if we couldn't get budget and SCHED_RESTART is set.
+	 */
+	needs_restart = blk_mq_sched_needs_restart(hctx);
+	if (!needs_restart ||
+	    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
+		blk_mq_run_hw_queue(hctx, true);
+	else if (needs_restart && (ret == BLK_STS_RESOURCE ||
+				   no_budget_avail))
+		blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY);
+}
+
 /*
  * Returns true if we did some work AND can potentially do more.
  */
@@ -1238,7 +1303,6 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 			     bool got_budget)
 {
 	enum prep_dispatch prep;
-	struct request_queue *q = hctx->queue;
 	struct request *rq;
 	int errors, queued;
 	blk_status_t ret = BLK_STS_OK;
@@ -1296,65 +1360,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 	 * that is where we will continue on next queue run.
 	 */
 	if (!list_empty(list)) {
-		bool needs_restart;
-		bool no_tag = false;
-		bool no_budget_avail = false;
-
-		/*
-		 * For non-shared tags, the RESTART check
-		 * will suffice.
-		 */
-		if (prep == PREP_DISPATCH_NO_TAG &&
-		    (hctx->flags & BLK_MQ_F_TAG_SHARED))
-			no_tag = true;
-		if (prep == PREP_DISPATCH_NO_BUDGET)
-			no_budget_avail = true;
-
-		/*
-		 * If we didn't flush the entire list, we could have told
-		 * the driver there was more coming, but that turned out to
-		 * be a lie.
-		 */
-		if (q->mq_ops->commit_rqs && queued)
-			q->mq_ops->commit_rqs(hctx);
-
-		spin_lock(&hctx->lock);
-		list_splice_tail_init(list, &hctx->dispatch);
-		spin_unlock(&hctx->lock);
-
-		/*
-		 * If SCHED_RESTART was set by the caller of this function and
-		 * it is no longer set that means that it was cleared by another
-		 * thread and hence that a queue rerun is needed.
-		 *
-		 * If 'no_tag' is set, that means that we failed getting
-		 * a driver tag with an I/O scheduler attached. If our dispatch
-		 * waitqueue is no longer active, ensure that we run the queue
-		 * AFTER adding our entries back to the list.
-		 *
-		 * If no I/O scheduler has been configured it is possible that
-		 * the hardware queue got stopped and restarted before requests
-		 * were pushed back onto the dispatch list. Rerun the queue to
-		 * avoid starvation. Notes:
-		 *	- blk_mq_run_hw_queue() checks whether or not a queue has
-		 *	  been stopped before rerunning a queue.
-		 *	- Some but not all block drivers stop a queue before
-		 *	  returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
-		 *	  and dm-rq.
-		 *
-		 * If driver returns BLK_STS_RESOURCE and SCHED_RESTART
-		 * bit is set, run queue after a delay to avoid IO stalls
-		 * that could otherwise occur if the queue is idle. We'll do
-		 * similar if we couldn't get budget and SCHED_RESTART is set.
-		 */
-		needs_restart = blk_mq_sched_needs_restart(hctx);
-		if (!needs_restart ||
-		    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
-			blk_mq_run_hw_queue(hctx, true);
-		else if (needs_restart && (ret == BLK_STS_RESOURCE ||
-					   no_budget_avail))
-			blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY);
-
+		blk_mq_handle_partial_dispatch(hctx, list, prep, ret, !!queued);
 		blk_mq_update_dispatch_busy(hctx, true);
 		return false;
 	} else
Move code for handling partial dispatch into one helper, so that
blk_mq_dispatch_rq_list gets a bit simplified, and easier to read.

No functional change.

Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Baolin Wang <baolin.wang7@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 126 ++++++++++++++++++++++++++-----------------------
 1 file changed, 66 insertions(+), 60 deletions(-)