From patchwork Mon May 25 09:38:02 2020
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 11568469
From: Ming Lei
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Sagi Grimberg, Baolin Wang, Christoph Hellwig, Douglas Anderson, Johannes Thumshirn
Subject: [PATCH V2 1/6] blk-mq: pass request queue into get/put budget callback
Date: Mon, 25 May 2020 17:38:02 +0800
Message-Id: <20200525093807.805155-2-ming.lei@redhat.com>
In-Reply-To: <20200525093807.805155-1-ming.lei@redhat.com>
References: <20200525093807.805155-1-ming.lei@redhat.com>

The blk-mq budget is abstracted from SCSI's device queue depth, and it is always per-request-queue rather than per-hctx. It is quite absurd to get a budget from one hctx and then dequeue a request from the scheduler queue, since that request may not belong to this hctx, at least for bfq and mq-deadline. So fix the mess and always pass the request queue to the get/put budget callbacks.
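Since the budget is just the device's remaining queue depth, all hctxs of one request queue share a single counter. As a rough user-space sketch of that idea (the structs here are simplified stand-ins, not the kernel's `struct request_queue`/`struct scsi_device`):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures. */
struct scsi_device {
	atomic_int device_busy;   /* commands currently in flight */
	int queue_depth;          /* device queue depth */
};

struct request_queue {
	struct scsi_device *queuedata;
};

/* Budget is per request queue: every hctx of the queue draws on it. */
bool scsi_mq_get_budget(struct request_queue *q)
{
	struct scsi_device *sdev = q->queuedata;
	int busy = atomic_fetch_add(&sdev->device_busy, 1);

	if (busy >= sdev->queue_depth) {
		/* Device already full: give the slot back and fail. */
		atomic_fetch_sub(&sdev->device_busy, 1);
		return false;
	}
	return true;
}

void scsi_mq_put_budget(struct request_queue *q)
{
	struct scsi_device *sdev = q->queuedata;

	atomic_fetch_sub(&sdev->device_busy, 1);
}
```

The real `scsi_mq_get_budget()` goes through `scsi_dev_queue_ready()`, which also checks blocked/quiesced state; the counter above only models the queue-depth part.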
Cc: Sagi Grimberg
Cc: Baolin Wang
Cc: Christoph Hellwig
Cc: Douglas Anderson
Reviewed-by: Johannes Thumshirn
Reviewed-by: Christoph Hellwig
Reviewed-by: Douglas Anderson
Reviewed-by: Sagi Grimberg
Signed-off-by: Ming Lei
---
 block/blk-mq-sched.c    |  8 ++++----
 block/blk-mq.c          |  8 ++++----
 block/blk-mq.h          | 12 ++++--------
 drivers/scsi/scsi_lib.c |  8 +++-----
 include/linux/blk-mq.h  |  4 ++--
 5 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index fdcc2c1dd178..a31e281e9d31 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -108,12 +108,12 @@ static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 			break;
 		}
 
-		if (!blk_mq_get_dispatch_budget(hctx))
+		if (!blk_mq_get_dispatch_budget(q))
 			break;
 
 		rq = e->type->ops.dispatch_request(hctx);
 		if (!rq) {
-			blk_mq_put_dispatch_budget(hctx);
+			blk_mq_put_dispatch_budget(q);
 			/*
 			 * We're releasing without dispatching. Holding the
 			 * budget could have blocked any "hctx"s with the
@@ -173,12 +173,12 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 		if (!sbitmap_any_bit_set(&hctx->ctx_map))
 			break;
 
-		if (!blk_mq_get_dispatch_budget(hctx))
+		if (!blk_mq_get_dispatch_budget(q))
 			break;
 
 		rq = blk_mq_dequeue_from_ctx(hctx, ctx);
 		if (!rq) {
-			blk_mq_put_dispatch_budget(hctx);
+			blk_mq_put_dispatch_budget(q);
 			/*
 			 * We're releasing without dispatching. Holding the
 			 * budget could have blocked any "hctx"s with the
diff --git a/block/blk-mq.c b/block/blk-mq.c
index b15509bbf9d8..63f71ec09326 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1259,7 +1259,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 		rq = list_first_entry(list, struct request, queuelist);
 
 		hctx = rq->mq_hctx;
-		if (!got_budget && !blk_mq_get_dispatch_budget(hctx)) {
+		if (!got_budget && !blk_mq_get_dispatch_budget(q)) {
 			blk_mq_put_driver_tag(rq);
 			no_budget_avail = true;
 			break;
@@ -1274,7 +1274,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 			 * we'll re-run it below.
 			 */
 			if (!blk_mq_mark_tag_wait(hctx, rq)) {
-				blk_mq_put_dispatch_budget(hctx);
+				blk_mq_put_dispatch_budget(q);
 				/*
 				 * For non-shared tags, the RESTART check
 				 * will suffice.
@@ -1922,11 +1922,11 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if (q->elevator && !bypass_insert)
 		goto insert;
 
-	if (!blk_mq_get_dispatch_budget(hctx))
+	if (!blk_mq_get_dispatch_budget(q))
 		goto insert;
 
 	if (!blk_mq_get_driver_tag(rq)) {
-		blk_mq_put_dispatch_budget(hctx);
+		blk_mq_put_dispatch_budget(q);
 		goto insert;
 	}
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 10bfdfb494fa..9540770de9dc 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -180,20 +180,16 @@ unsigned int blk_mq_in_flight(struct request_queue *q, struct hd_struct *part);
 void blk_mq_in_flight_rw(struct request_queue *q, struct hd_struct *part,
 			 unsigned int inflight[2]);
 
-static inline void blk_mq_put_dispatch_budget(struct blk_mq_hw_ctx *hctx)
+static inline void blk_mq_put_dispatch_budget(struct request_queue *q)
 {
-	struct request_queue *q = hctx->queue;
-
 	if (q->mq_ops->put_budget)
-		q->mq_ops->put_budget(hctx);
+		q->mq_ops->put_budget(q);
 }
 
-static inline bool blk_mq_get_dispatch_budget(struct blk_mq_hw_ctx *hctx)
+static inline bool blk_mq_get_dispatch_budget(struct request_queue *q)
 {
-	struct request_queue *q = hctx->queue;
-
 	if (q->mq_ops->get_budget)
-		return q->mq_ops->get_budget(hctx);
+		return q->mq_ops->get_budget(q);
 	return true;
 }
 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 82ad0244b3d0..b9adee0a9266 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1624,17 +1624,15 @@ static void scsi_mq_done(struct scsi_cmnd *cmd)
 	clear_bit(SCMD_STATE_COMPLETE, &cmd->state);
 }
 
-static void scsi_mq_put_budget(struct blk_mq_hw_ctx *hctx)
+static void scsi_mq_put_budget(struct request_queue *q)
 {
-	struct request_queue *q = hctx->queue;
 	struct scsi_device *sdev = q->queuedata;
 
 	atomic_dec(&sdev->device_busy);
 }
 
-static bool scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
+static bool scsi_mq_get_budget(struct request_queue *q)
 {
-	struct request_queue *q = hctx->queue;
 	struct scsi_device *sdev = q->queuedata;
 
 	return scsi_dev_queue_ready(q, sdev);
@@ -1701,7 +1699,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (scsi_target(sdev)->can_queue > 0)
 		atomic_dec(&scsi_target(sdev)->target_busy);
 out_put_budget:
-	scsi_mq_put_budget(hctx);
+	scsi_mq_put_budget(q);
 	switch (ret) {
 	case BLK_STS_OK:
 		break;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 94c1318e4c1f..95bb54fde713 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -270,8 +270,8 @@ struct blk_mq_queue_data {
 typedef blk_status_t (queue_rq_fn)(struct blk_mq_hw_ctx *,
 		const struct blk_mq_queue_data *);
 typedef void (commit_rqs_fn)(struct blk_mq_hw_ctx *);
-typedef bool (get_budget_fn)(struct blk_mq_hw_ctx *);
-typedef void (put_budget_fn)(struct blk_mq_hw_ctx *);
+typedef bool (get_budget_fn)(struct request_queue *);
+typedef void (put_budget_fn)(struct request_queue *);
 typedef enum blk_eh_timer_return (timeout_fn)(struct request *, bool);
 typedef int (init_hctx_fn)(struct blk_mq_hw_ctx *, void *, unsigned int);
 typedef void (exit_hctx_fn)(struct blk_mq_hw_ctx *, unsigned int);

From patchwork Mon May 25 09:38:03 2020
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 11568473
From: Ming Lei
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Sagi Grimberg, Baolin Wang, Christoph Hellwig
Subject: [PATCH V2 2/6] blk-mq: pass hctx to blk_mq_dispatch_rq_list
Date: Mon, 25 May 2020 17:38:03 +0800
Message-Id: <20200525093807.805155-3-ming.lei@redhat.com>
In-Reply-To: <20200525093807.805155-1-ming.lei@redhat.com>
References: <20200525093807.805155-1-ming.lei@redhat.com>

All requests in the 'list' passed to blk_mq_dispatch_rq_list belong to the same hctx, so it is better to pass the hctx instead of the request queue, because blk-mq's dispatch target is the hctx rather than the request queue.

Cc: Sagi Grimberg
Cc: Baolin Wang
Cc: Christoph Hellwig
Reviewed-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Signed-off-by: Ming Lei
Reviewed-by: Johannes Thumshirn
---
 block/blk-mq-sched.c | 14 ++++++--------
 block/blk-mq.c       |  6 +++---
 block/blk-mq.h       |  2 +-
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index a31e281e9d31..632c6f8b63f7 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -96,10 +96,9 @@ static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 	struct elevator_queue *e = q->elevator;
 	LIST_HEAD(rq_list);
 	int ret = 0;
+	struct request *rq;
 
 	do {
-		struct request *rq;
-
 		if (e->type->ops.has_work && !e->type->ops.has_work(hctx))
 			break;
@@ -131,7 +130,7 @@ static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 		 * in blk_mq_dispatch_rq_list().
 		 */
 		list_add(&rq->queuelist, &rq_list);
-	} while (blk_mq_dispatch_rq_list(q, &rq_list, true));
+	} while (blk_mq_dispatch_rq_list(rq->mq_hctx, &rq_list, true));
 
 	return ret;
 }
@@ -161,10 +160,9 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 	LIST_HEAD(rq_list);
 	struct blk_mq_ctx *ctx = READ_ONCE(hctx->dispatch_from);
 	int ret = 0;
+	struct request *rq;
 
 	do {
-		struct request *rq;
-
 		if (!list_empty_careful(&hctx->dispatch)) {
 			ret = -EAGAIN;
 			break;
 		}
@@ -200,7 +198,7 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 		/* round robin for fair dispatch */
 		ctx = blk_mq_next_ctx(hctx, rq->mq_ctx);
 
-	} while (blk_mq_dispatch_rq_list(q, &rq_list, true));
+	} while (blk_mq_dispatch_rq_list(rq->mq_hctx, &rq_list, true));
 
 	WRITE_ONCE(hctx->dispatch_from, ctx);
 	return ret;
@@ -240,7 +238,7 @@ static int __blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 	 */
 	if (!list_empty(&rq_list)) {
 		blk_mq_sched_mark_restart_hctx(hctx);
-		if (blk_mq_dispatch_rq_list(q, &rq_list, false)) {
+		if (blk_mq_dispatch_rq_list(hctx, &rq_list, false)) {
 			if (has_sched_dispatch)
 				ret = blk_mq_do_dispatch_sched(hctx);
 			else
@@ -253,7 +251,7 @@ static int __blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 		ret = blk_mq_do_dispatch_ctx(hctx);
 	} else {
 		blk_mq_flush_busy_ctxs(hctx, &rq_list);
-		blk_mq_dispatch_rq_list(q, &rq_list, false);
+		blk_mq_dispatch_rq_list(hctx, &rq_list, false);
 	}
 
 	return ret;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 63f71ec09326..b0049f4d5128 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1233,10 +1233,10 @@ static void blk_mq_handle_zone_resource(struct request *rq,
 /*
  * Returns true if we did some work AND can potentially do more.
  */
-bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
+bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 			     bool got_budget)
 {
-	struct blk_mq_hw_ctx *hctx;
+	struct request_queue *q = hctx->queue;
 	struct request *rq, *nxt;
 	bool no_tag = false;
 	int errors, queued;
@@ -1258,7 +1258,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 		rq = list_first_entry(list, struct request, queuelist);
 
-		hctx = rq->mq_hctx;
+		WARN_ON_ONCE(hctx != rq->mq_hctx);
 		if (!got_budget && !blk_mq_get_dispatch_budget(q)) {
 			blk_mq_put_driver_tag(rq);
 			no_budget_avail = true;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9540770de9dc..9c0e93d4fe38 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -40,7 +40,7 @@ struct blk_mq_ctx {
 void blk_mq_exit_queue(struct request_queue *q);
 int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
 void blk_mq_wake_waiters(struct request_queue *q);
-bool blk_mq_dispatch_rq_list(struct request_queue *, struct list_head *, bool);
+bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *, bool);
 void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 				bool kick_requeue_list);
 void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list);

From patchwork Mon May 25 09:38:04 2020
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 11568475
From: Ming Lei
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Sagi Grimberg, Baolin Wang, Christoph Hellwig
Subject: [PATCH V2 3/6] blk-mq: move getting driver tag and budget into one helper
Date: Mon, 25 May 2020 17:38:04 +0800
Message-Id: <20200525093807.805155-4-ming.lei@redhat.com>
In-Reply-To: <20200525093807.805155-1-ming.lei@redhat.com>
References: <20200525093807.805155-1-ming.lei@redhat.com>

Move the code for getting the driver tag and the budget into one helper, so blk_mq_dispatch_rq_list gets a bit simplified and easier to read.

Meantime move the updating of 'no_tag' and 'no_budget_avail' into the branch for handling partial dispatch, because that branch is the only consumer of the two local variables.

Also rename the 'got_budget' parameter to 'ask_budget'.

No functional change.

Cc: Sagi Grimberg
Cc: Baolin Wang
Cc: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Christoph Hellwig
Signed-off-by: Ming Lei
Reviewed-by: Johannes Thumshirn
---
 block/blk-mq.c | 75 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 49 insertions(+), 26 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b0049f4d5128..1b257a94b020 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1230,18 +1230,51 @@ static void blk_mq_handle_zone_resource(struct request *rq,
 	__blk_mq_requeue_request(rq);
 }
 
+enum prep_dispatch {
+	PREP_DISPATCH_OK,
+	PREP_DISPATCH_NO_TAG,
+	PREP_DISPATCH_NO_BUDGET,
+};
+
+static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq,
+						  bool ask_budget)
+{
+	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
+
+	if (ask_budget && !blk_mq_get_dispatch_budget(rq->q)) {
+		blk_mq_put_driver_tag(rq);
+		return PREP_DISPATCH_NO_BUDGET;
+	}
+
+	if (!blk_mq_get_driver_tag(rq)) {
+		/*
+		 * The initial allocation attempt failed, so we need to
+		 * rerun the hardware queue when a tag is freed. The
+		 * waitqueue takes care of that. If the queue is run
+		 * before we add this entry back on the dispatch list,
+		 * we'll re-run it below.
+		 */
+		if (!blk_mq_mark_tag_wait(hctx, rq)) {
+			/* budget is always obtained before getting tag */
+			blk_mq_put_dispatch_budget(rq->q);
+			return PREP_DISPATCH_NO_TAG;
+		}
+	}
+
+	return PREP_DISPATCH_OK;
+}
+
 /*
  * Returns true if we did some work AND can potentially do more.
  */
 bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 			     bool got_budget)
 {
+	enum prep_dispatch prep;
 	struct request_queue *q = hctx->queue;
 	struct request *rq, *nxt;
-	bool no_tag = false;
 	int errors, queued;
 	blk_status_t ret = BLK_STS_OK;
-	bool no_budget_avail = false;
 	LIST_HEAD(zone_list);
 
 	if (list_empty(list))
@@ -1259,31 +1292,9 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 		rq = list_first_entry(list, struct request, queuelist);
 
 		WARN_ON_ONCE(hctx != rq->mq_hctx);
-		if (!got_budget && !blk_mq_get_dispatch_budget(q)) {
-			blk_mq_put_driver_tag(rq);
-			no_budget_avail = true;
+		prep = blk_mq_prep_dispatch_rq(rq, !got_budget);
+		if (prep != PREP_DISPATCH_OK)
 			break;
-		}
-
-		if (!blk_mq_get_driver_tag(rq)) {
-			/*
-			 * The initial allocation attempt failed, so we need to
-			 * rerun the hardware queue when a tag is freed. The
-			 * waitqueue takes care of that. If the queue is run
-			 * before we add this entry back on the dispatch list,
-			 * we'll re-run it below.
-			 */
-			if (!blk_mq_mark_tag_wait(hctx, rq)) {
-				blk_mq_put_dispatch_budget(q);
-				/*
-				 * For non-shared tags, the RESTART check
-				 * will suffice.
-				 */
-				if (hctx->flags & BLK_MQ_F_TAG_SHARED)
-					no_tag = true;
-				break;
-			}
-		}
 
 		list_del_init(&rq->queuelist);
@@ -1336,6 +1347,18 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 	 */
 	if (!list_empty(list)) {
 		bool needs_restart;
+		bool no_tag = false;
+		bool no_budget_avail = false;
+
+		/*
+		 * For non-shared tags, the RESTART check
+		 * will suffice.
+		 */
+		if (prep == PREP_DISPATCH_NO_TAG &&
+		    (hctx->flags & BLK_MQ_F_TAG_SHARED))
+			no_tag = true;
+		if (prep == PREP_DISPATCH_NO_BUDGET)
+			no_budget_avail = true;
 
 		/*
 		 * If we didn't flush the entire list, we could have told

From patchwork Mon May 25 09:38:05 2020
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 11568477
From: Ming Lei
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Sagi Grimberg, Baolin Wang, Christoph Hellwig
Subject: [PATCH V2 4/6] blk-mq: remove dead check from blk_mq_dispatch_rq_list
Date: Mon, 25 May 2020 17:38:05 +0800
Message-Id: <20200525093807.805155-5-ming.lei@redhat.com>
In-Reply-To: <20200525093807.805155-1-ming.lei@redhat.com>
References: <20200525093807.805155-1-ming.lei@redhat.com>

When BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE is returned from .queue_rq, the 'list' variable always holds the rq that was not queued to the LLD successfully, so blk_mq_dispatch_rq_list() already returns false from the '!list_empty(list)' branch in that case. Remove the dead check.

No functional change.

Cc: Sagi Grimberg
Cc: Baolin Wang
Cc: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Christoph Hellwig
Signed-off-by: Ming Lei
---
 block/blk-mq.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1b257a94b020..a368eeb9d378 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1410,13 +1410,6 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 	} else
 		blk_mq_update_dispatch_busy(hctx, false);
 
-	/*
-	 * If the host/device is unable to accept more work, inform the
-	 * caller of that.
-	 */
-	if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-		return false;
-
 	return (queued + errors) != 0;
 }

From patchwork Mon May 25 09:38:06 2020
(int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C49D680B713; Mon, 25 May 2020 09:38:52 +0000 (UTC) Received: from localhost (ovpn-12-137.pek2.redhat.com [10.72.12.137]) by smtp.corp.redhat.com (Postfix) with ESMTP id 350B653E02; Mon, 25 May 2020 09:38:48 +0000 (UTC) From: Ming Lei To: Jens Axboe Cc: linux-block@vger.kernel.org, Ming Lei , Sagi Grimberg , Baolin Wang , Christoph Hellwig Subject: [PATCH V2 5/6] blk-mq: pass obtained budget count to blk_mq_dispatch_rq_list Date: Mon, 25 May 2020 17:38:06 +0800 Message-Id: <20200525093807.805155-6-ming.lei@redhat.com> In-Reply-To: <20200525093807.805155-1-ming.lei@redhat.com> References: <20200525093807.805155-1-ming.lei@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Pass obtained budget count to blk_mq_dispatch_rq_list(), and prepare for supporting fully batching submission. With the obtained budget count, it is easier to put extra budgets in case of .queue_rq failure. Meantime remove the old 'got_budget' parameter. Cc: Sagi Grimberg Cc: Baolin Wang Cc: Christoph Hellwig Signed-off-by: Ming Lei --- block/blk-mq-sched.c | 8 ++++---- block/blk-mq.c | 13 +++++++++---- block/blk-mq.h | 3 ++- 3 files changed, 15 insertions(+), 9 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 632c6f8b63f7..4c72073830f3 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -130,7 +130,7 @@ static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx) * in blk_mq_dispatch_rq_list(). 
*/ list_add(&rq->queuelist, &rq_list); - } while (blk_mq_dispatch_rq_list(rq->mq_hctx, &rq_list, true)); + } while (blk_mq_dispatch_rq_list(rq->mq_hctx, &rq_list, 1)); return ret; } @@ -198,7 +198,7 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx) /* round robin for fair dispatch */ ctx = blk_mq_next_ctx(hctx, rq->mq_ctx); - } while (blk_mq_dispatch_rq_list(rq->mq_hctx, &rq_list, true)); + } while (blk_mq_dispatch_rq_list(rq->mq_hctx, &rq_list, 1)); WRITE_ONCE(hctx->dispatch_from, ctx); return ret; @@ -238,7 +238,7 @@ static int __blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) */ if (!list_empty(&rq_list)) { blk_mq_sched_mark_restart_hctx(hctx); - if (blk_mq_dispatch_rq_list(hctx, &rq_list, false)) { + if (blk_mq_dispatch_rq_list(hctx, &rq_list, 0)) { if (has_sched_dispatch) ret = blk_mq_do_dispatch_sched(hctx); else @@ -251,7 +251,7 @@ static int __blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx) ret = blk_mq_do_dispatch_ctx(hctx); } else { blk_mq_flush_busy_ctxs(hctx, &rq_list); - blk_mq_dispatch_rq_list(hctx, &rq_list, false); + blk_mq_dispatch_rq_list(hctx, &rq_list, 0); } return ret; diff --git a/block/blk-mq.c b/block/blk-mq.c index a368eeb9d378..3f672b2662a9 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1256,7 +1256,8 @@ static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq, */ if (!blk_mq_mark_tag_wait(hctx, rq)) { /* budget is always obtained before getting tag */ - blk_mq_put_dispatch_budget(rq->q); + if (ask_budget) + blk_mq_put_dispatch_budget(rq->q); return PREP_DISPATCH_NO_TAG; } } @@ -1268,7 +1269,7 @@ static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq, * Returns true if we did some work AND can potentially do more. 
*/ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list, - bool got_budget) + unsigned int nr_budgets) { enum prep_dispatch prep; struct request_queue *q = hctx->queue; @@ -1280,7 +1281,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list, if (list_empty(list)) return false; - WARN_ON(!list_is_singular(list) && got_budget); + WARN_ON(!list_is_singular(list) && nr_budgets); /* * Now process all the entries, sending them to the driver. @@ -1292,7 +1293,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list, rq = list_first_entry(list, struct request, queuelist); WARN_ON_ONCE(hctx != rq->mq_hctx); - prep = blk_mq_prep_dispatch_rq(rq, !got_budget); + prep = blk_mq_prep_dispatch_rq(rq, !nr_budgets); if (prep != PREP_DISPATCH_OK) break; @@ -1349,7 +1350,11 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list, bool needs_restart; bool no_tag = false; bool no_budget_avail = false; + unsigned i = 0; + /* release got budgets */ + while (i++ < nr_budgets) + blk_mq_put_dispatch_budget(hctx->queue); /* * For non-shared tags, the RESTART check * will suffice. 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9c0e93d4fe38..97d39a63353a 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -40,7 +40,8 @@ struct blk_mq_ctx {
 void blk_mq_exit_queue(struct request_queue *q);
 int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
 void blk_mq_wake_waiters(struct request_queue *q);
-bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *, bool);
+bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *,
+			     unsigned int);
 void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 				bool kick_requeue_list);
 void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list);

From patchwork Mon May 25 09:38:07 2020
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 11568481
From: Ming Lei
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Ming Lei, Sagi Grimberg, Baolin Wang, Christoph Hellwig
Subject: [PATCH V2 6/6] blk-mq: support batching dispatch in case of io scheduler
Date: Mon, 25 May 2020 17:38:07 +0800
Message-Id: <20200525093807.805155-7-ming.lei@redhat.com>
In-Reply-To: <20200525093807.805155-1-ming.lei@redhat.com>
References: <20200525093807.805155-1-ming.lei@redhat.com>

More and more drivers want to get batches of requests queued from the
block layer, such as mmc and tcp-based storage drivers. Current in-tree
users include virtio-scsi, virtio-blk and nvme.

For the "none" scheduler we already support batching dispatch. But with
an io scheduler, every time we take just one request from the scheduler
and pass that single request to blk_mq_dispatch_rq_list().
This makes batching dispatch impossible when an io scheduler is applied.
One reason is that we don't want to hurt sequential IO performance:
the chance of IO merging is reduced when more requests are dequeued from
the scheduler queue.

Try to support batching dispatch for io schedulers, starting with the
following simple approach:

1) still make sure we can get budget before dequeueing a request;

2) use hctx->dispatch_busy to evaluate whether the queue is busy. If it
is busy, fall back to non-batching dispatch; otherwise dequeue as many
requests as possible from the scheduler and pass them to
blk_mq_dispatch_rq_list().

Wrt. 2), we use a similar policy for "none", and it turns out that SCSI
SSD performance is improved much. In the future, maybe we can develop a
more intelligent algorithm for batching dispatch.

[1] https://lore.kernel.org/linux-block/20200512075501.GF1531898@T590/#r
[2] https://lore.kernel.org/linux-block/fe6bd8b9-6ed9-b225-f80c-314746133722@grimberg.me/

Cc: Sagi Grimberg
Cc: Baolin Wang
Cc: Christoph Hellwig
Signed-off-by: Ming Lei
---
 block/blk-mq-sched.c | 75 +++++++++++++++++++++++++++++++++++++++++++-
 block/blk-mq.c       |  2 --
 2 files changed, 74 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 4c72073830f3..75cf9528ac01 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 #include
@@ -80,6 +81,68 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
 	blk_mq_run_hw_queue(hctx, true);
 }
 
+/*
+ * We know bfq and deadline apply a single scheduler queue instead of
+ * multi queue. However, the two are often used on single queue devices,
+ * and the current @hctx should reflect the real device status most of
+ * the time because of the locality principle.
+ *
+ * So use the current hctx->dispatch_busy directly for figuring out the
+ * batching dispatch count.
+ */
+static unsigned int blk_mq_sched_get_batching_nr(struct blk_mq_hw_ctx *hctx)
+{
+	if (hctx->dispatch_busy)
+		return 1;
+	return hctx->queue->nr_requests;
+}
+
+static int sched_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
+{
+	struct request *rqa = container_of(a, struct request, queuelist);
+	struct request *rqb = container_of(b, struct request, queuelist);
+
+	return rqa->mq_hctx > rqb->mq_hctx;
+}
+
+static inline void blk_mq_do_dispatch_rq_lists(struct blk_mq_hw_ctx *hctx,
+		struct list_head *lists, bool multi_hctxs, unsigned count)
+{
+	if (likely(!multi_hctxs)) {
+		blk_mq_dispatch_rq_list(hctx, lists, count);
+		return;
+	}
+
+	/*
+	 * Requests from different hctx may be dequeued from some scheduler,
+	 * such as bfq and deadline.
+	 *
+	 * Sort the requests in the list according to their hctx, dispatch
+	 * batching requests from same hctx
+	 */
+	list_sort(NULL, lists, sched_rq_cmp);
+
+	while (!list_empty(lists)) {
+		LIST_HEAD(list);
+		struct request *new, *rq = list_first_entry(lists,
+				struct request, queuelist);
+		unsigned cnt = 0;
+
+		list_for_each_entry(new, lists, queuelist) {
+			if (new->mq_hctx != rq->mq_hctx)
+				break;
+			cnt++;
+		}
+
+		if (new->mq_hctx == rq->mq_hctx)
+			list_splice_tail_init(lists, &list);
+		else
+			list_cut_before(&list, lists, &new->queuelist);
+
+		blk_mq_dispatch_rq_list(rq->mq_hctx, &list, cnt);
+	}
+}
+
 #define BLK_MQ_BUDGET_DELAY	3		/* ms units */
 
 /*
@@ -97,6 +160,9 @@ static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 	LIST_HEAD(rq_list);
 	int ret = 0;
 	struct request *rq;
+	int cnt = 0;
+	unsigned int max_dispatch = blk_mq_sched_get_batching_nr(hctx);
+	bool multi_hctxs = false;
 
 	do {
 		if (e->type->ops.has_work && !e->type->ops.has_work(hctx))
@@ -130,7 +196,14 @@ static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 		 * in blk_mq_dispatch_rq_list().
 		 */
 		list_add(&rq->queuelist, &rq_list);
-	} while (blk_mq_dispatch_rq_list(rq->mq_hctx, &rq_list, 1));
+		cnt++;
+
+		if (rq->mq_hctx != hctx && !multi_hctxs)
+			multi_hctxs = true;
+	} while (cnt < max_dispatch);
+
+	if (cnt)
+		blk_mq_do_dispatch_rq_lists(hctx, &rq_list, multi_hctxs, cnt);
 
 	return ret;
 }
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3f672b2662a9..ed61811e1611 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1281,8 +1281,6 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 	if (list_empty(list))
 		return false;
 
-	WARN_ON(!list_is_singular(list) && nr_budgets);
-
 	/*
 	 * Now process all the entries, sending them to the driver.
 	 */