Message ID | 20170206195330.GB20714@vader (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 02/06/2017 12:53 PM, Omar Sandoval wrote: > On Mon, Feb 06, 2017 at 12:39:57PM -0700, Jens Axboe wrote: >> On 02/06/2017 12:24 PM, Omar Sandoval wrote: >>> From: Omar Sandoval <osandov@fb.com> >>> >>> In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() >>> after we dispatch requests left over on our hardware queue dispatch >>> list. This is so we'll go back and dispatch requests from the scheduler. >>> In this case, it's only necessary to restart the hardware queue that we >>> are running; there's no reason to run other hardware queues just because >>> we are using shared tags. >>> >>> So, split out blk_mq_sched_mark_restart() into two operations, one for >>> just the hardware queue and one for the whole request queue. The core >>> code uses both, and I/O schedulers may also want to use them. >>> >>> Signed-off-by: Omar Sandoval <osandov@fb.com> >>> --- >>> Patch based on block/for-next. >>> >>> block/blk-mq-sched.c | 2 +- >>> block/blk-mq-sched.h | 25 ++++++++++++++++++------- >>> block/blk-mq.c | 5 ++++- >>> 3 files changed, 23 insertions(+), 9 deletions(-) >>> >>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c >>> index ee455e7cf9d8..7538565359ea 100644 >>> --- a/block/blk-mq-sched.c >>> +++ b/block/blk-mq-sched.c >>> @@ -201,7 +201,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) >>> * needing a restart in that case. >>> */ >>> if (!list_empty(&rq_list)) { >>> - blk_mq_sched_mark_restart(hctx); >>> + blk_mq_sched_mark_restart_hctx(hctx); >>> blk_mq_dispatch_rq_list(hctx, &rq_list); >> >> What if we dispatched nothing on this hardware queue, and it currently >> doesn't have any IO pending? > > Hm, so there are two ways that could happen. If it's because > ->queue_rq() returned BLK_MQ_RQ_QUEUE_BUSY, then the driver is supposed > to kick I/O off again, right? > > If it's because we failed to get a driver tag, then we'll call > blk_mq_sched_mark_restart_queue() in the shared case. I just realized > that there's a bug there, though. Since we already set the hctx restart > bit, we won't set the queue restart bit. The below should work, and > makes more sense in general. > > Or were you thinking of something else? No, I think that covers it, I had not read far enough either to see that you handle the shared tag case for tag starvation in the caller.
On Mon, Feb 06, 2017 at 01:07:41PM -0700, Jens Axboe wrote: > On 02/06/2017 12:53 PM, Omar Sandoval wrote: > > On Mon, Feb 06, 2017 at 12:39:57PM -0700, Jens Axboe wrote: > >> On 02/06/2017 12:24 PM, Omar Sandoval wrote: > >>> From: Omar Sandoval <osandov@fb.com> > >>> > >>> In blk_mq_sched_dispatch_requests(), we call blk_mq_sched_mark_restart() > >>> after we dispatch requests left over on our hardware queue dispatch > >>> list. This is so we'll go back and dispatch requests from the scheduler. > >>> In this case, it's only necessary to restart the hardware queue that we > >>> are running; there's no reason to run other hardware queues just because > >>> we are using shared tags. > >>> > >>> So, split out blk_mq_sched_mark_restart() into two operations, one for > >>> just the hardware queue and one for the whole request queue. The core > >>> code uses both, and I/O schedulers may also want to use them. > >>> > >>> Signed-off-by: Omar Sandoval <osandov@fb.com> > >>> --- > >>> Patch based on block/for-next. > >>> > >>> block/blk-mq-sched.c | 2 +- > >>> block/blk-mq-sched.h | 25 ++++++++++++++++++------- > >>> block/blk-mq.c | 5 ++++- > >>> 3 files changed, 23 insertions(+), 9 deletions(-) > >>> > >>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c > >>> index ee455e7cf9d8..7538565359ea 100644 > >>> --- a/block/blk-mq-sched.c > >>> +++ b/block/blk-mq-sched.c > >>> @@ -201,7 +201,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) > >>> * needing a restart in that case. > >>> */ > >>> if (!list_empty(&rq_list)) { > >>> - blk_mq_sched_mark_restart(hctx); > >>> + blk_mq_sched_mark_restart_hctx(hctx); > >>> blk_mq_dispatch_rq_list(hctx, &rq_list); > >> > >> What if we dispatched nothing on this hardware queue, and it currently > >> doesn't have any IO pending? > > > > Hm, so there are two ways that could happen. If it's because > > ->queue_rq() returned BLK_MQ_RQ_QUEUE_BUSY, then the driver is supposed > > to kick I/O off again, right? > > > > If it's because we failed to get a driver tag, then we'll call > > blk_mq_sched_mark_restart_queue() in the shared case. I just realized > > that there's a bug there, though. Since we already set the hctx restart > > bit, we won't set the queue restart bit. The below should work, and > > makes more sense in general. > > > > Or were you thinking of something else? > > No, I think that covers it, I had not read far enough either to see that > you handle the shared tag case for tag starvation in the caller. Yup, I considered making that its own helper but I figured we could do that when we need the same logic elsewhere.
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index ee455e7cf9d8..7538565359ea 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -201,7 +201,7 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) * needing a restart in that case. */ if (!list_empty(&rq_list)) { - blk_mq_sched_mark_restart(hctx); + blk_mq_sched_mark_restart_hctx(hctx); blk_mq_dispatch_rq_list(hctx, &rq_list); } else if (!e || !e->type->ops.mq.dispatch_request) { blk_mq_flush_busy_ctxs(hctx, &rq_list); diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h index 5954859c8670..36cc68481b0c 100644 --- a/block/blk-mq-sched.h +++ b/block/blk-mq-sched.h @@ -121,17 +121,27 @@ static inline bool blk_mq_sched_has_work(struct blk_mq_hw_ctx *hctx) return false; } -static inline void blk_mq_sched_mark_restart(struct blk_mq_hw_ctx *hctx) +/* + * Mark a hardware queue as needing a restart. + */ +static inline void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx) { - if (!test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state)) { + if (!test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state)) set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state); - if (hctx->flags & BLK_MQ_F_TAG_SHARED) { - struct request_queue *q = hctx->queue; +} + +/* + * Mark a hardware queue and the request queue it belongs to as needing a + * restart. + */ +static inline void blk_mq_sched_mark_restart_queue(struct blk_mq_hw_ctx *hctx) +{ + struct request_queue *q = hctx->queue; - if (!test_bit(QUEUE_FLAG_RESTART, &q->queue_flags)) - set_bit(QUEUE_FLAG_RESTART, &q->queue_flags); - } - } + if (!test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state)) + set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state); + if (!test_bit(QUEUE_FLAG_RESTART, &q->queue_flags)) + set_bit(QUEUE_FLAG_RESTART, &q->queue_flags); } static inline bool blk_mq_sched_needs_restart(struct blk_mq_hw_ctx *hctx) diff --git a/block/blk-mq.c b/block/blk-mq.c index be183e6115a1..a494c0b589d5 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -937,7 +937,10 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list) * in case the needed IO completed right before we * marked the queue as needing a restart. */ - blk_mq_sched_mark_restart(hctx); + if (hctx->flags & BLK_MQ_F_TAG_SHARED) + blk_mq_sched_mark_restart_queue(hctx); + else + blk_mq_sched_mark_restart_hctx(hctx); if (!blk_mq_get_driver_tag(rq, &hctx, false)) break; }