
[V4,3/3] blk-mq: issue request directly for blk_insert_cloned_request

Message ID 20180115165810.2515-4-ming.lei@redhat.com (mailing list archive)
State New, archived

Commit Message

Ming Lei Jan. 15, 2018, 4:58 p.m. UTC
blk_insert_cloned_request() is called in the fast path of the dm-rq driver,
and it appends the request directly to the hctx->dispatch_list of the
underlying queue.

1) This approach is inefficient because the hctx lock is always required.

2) With blk_insert_cloned_request(), the underlying queue's IO scheduler is
bypassed entirely and IO scheduling is left completely to the dm-rq driver.
But the dm-rq driver gets no dispatch feedback from the underlying queue,
and this information is extremely useful for IO merging. Without it, blk-mq
essentially cannot merge IO, which causes very poor sequential IO
performance.

This patch uses blk_mq_try_issue_directly() to dispatch the request to the
underlying queue and provides the dispatch result to dm-rq and blk-mq,
which improves both of the above situations considerably.

With this patch, sequential IO is improved by 3X in my test over
mpath/virtio-scsi.
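
For illustration only (not part of this patch), a rough sketch of how a
dm-rq style caller could consume this dispatch feedback; the helper name
and the exact requeue handling here are assumptions:

/*
 * Illustrative sketch, not part of this patch: act on the blk_status_t
 * that blk_insert_cloned_request() hands back after the direct issue.
 */
static int dispatch_clone_sketch(struct request *clone)
{
	blk_status_t ret;

	/* Issue the clone directly to the underlying queue. */
	ret = blk_insert_cloned_request(clone->q, clone);
	if (ret == BLK_STS_RESOURCE) {
		/*
		 * The low-level queue is busy: unprep the clone and let DM
		 * core requeue the original request, so blk-mq can merge
		 * later IO against it instead of piling up dispatch entries.
		 */
		blk_rq_unprep_clone(clone);
		return DM_MAPIO_REQUEUE;
	}
	if (ret != BLK_STS_OK)
		return DM_MAPIO_KILL;

	return DM_MAPIO_REMAPPED;
}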

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

Comments

Mike Snitzer Jan. 16, 2018, 1:34 a.m. UTC | #1
On Mon, Jan 15 2018 at 11:58am -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> blk_insert_cloned_request() is called in the fast path of the dm-rq driver,
> and it appends the request directly to the hctx->dispatch_list of the
> underlying queue.
> 
> 1) This approach is inefficient because the hctx lock is always required.
> 
> 2) With blk_insert_cloned_request(), the underlying queue's IO scheduler is
> bypassed entirely and IO scheduling is left completely to the dm-rq driver.
> But the dm-rq driver gets no dispatch feedback from the underlying queue,
> and this information is extremely useful for IO merging. Without it, blk-mq
> essentially cannot merge IO, which causes very poor sequential IO
> performance.
> 
> This patch uses blk_mq_try_issue_directly() to dispatch the request to the
> underlying queue and provides the dispatch result to dm-rq and blk-mq,
> which improves both of the above situations considerably.
> 
> With this patch, sequential IO is improved by 3X in my test over
> mpath/virtio-scsi.
> 
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq.c | 29 ++++++++++++++++++++++++++---
>  1 file changed, 26 insertions(+), 3 deletions(-)

Hey Ming,

I also just noticed your V4 of this patch only includes the
block/blk-mq.c changes.

You're missing:

 block/blk-core.c   |  3 +--
 block/blk-mq.h     |  3 +++
 drivers/md/dm-rq.c | 19 +++++++++++++---
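
For context, a hedged sketch of the kind of glue those missing hunks would
provide (the authoritative hunks are in the dm tree tagged below, so treat
this purely as an illustration of the intended call flow):

/* block/blk-mq.h: declaration for the helper added in this 3/3 patch. */
blk_status_t blk_mq_request_direct_issue(struct request *rq);

/*
 * block/blk-core.c: illustrative shape of blk_insert_cloned_request()
 * issuing directly on blk-mq queues instead of appending the request to
 * hctx->dispatch_list, so the dispatch result reaches the dm-rq caller.
 * Only the blk-mq branch is shown; the legacy request_fn path is omitted.
 */
blk_status_t blk_insert_cloned_request(struct request_queue *q,
				       struct request *rq)
{
	if (q->mq_ops) {
		if (blk_queue_io_stat(q))
			blk_account_io_start(rq, true);
		/* Bypass the bottom queue's scheduler and issue directly. */
		return blk_mq_request_direct_issue(rq);
	}

	/* ... legacy (non-mq) path unchanged ... */
	return BLK_STS_OK;
}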

Please let Jens know if you're OK with my V4, tagged here:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/tag/?h=for-block-4.16/dm-changes-2
And also detailed in this message from earlier in this thread:
https://marc.info/?l=linux-block&m=151603824725438&w=2

Or please generate V5 of your series.  Hopefully it'd include the header
I revised for this 3/3 patch, see:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-block-4.16/dm-changes-2&id=d86beab5712a8f18123011487dee797a1e3a07e1

We also need to address the issues Jens noticed and I looked at a bit
closer: https://marc.info/?l=linux-block&m=151604528127707&w=2
(those issues might need fixing first, marked for stable@, and the
series rebased on top of them?)

Thanks!
Mike

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1654c9c284d8..ce3965f271e3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1730,6 +1730,11 @@  static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	blk_qc_t new_cookie;
 	blk_status_t ret = BLK_STS_OK;
 	bool run_queue = true;
+	/*
+	 * This function is called from blk_insert_cloned_request() if
+	 * 'cookie' is NULL, and for dispatching this request only.
+	 */
+	bool dispatch_only = !cookie;
 
 	/* RCU or SRCU read lock is needed before checking quiesced flag */
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
@@ -1737,10 +1742,17 @@  static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		goto insert;
 	}
 
-	if (q->elevator)
+	if (q->elevator && !dispatch_only)
 		goto insert;
 
 	ret = __blk_mq_issue_req(hctx, rq, &new_cookie);
+	if (dispatch_only) {
+		if (ret == BLK_STS_AGAIN)
+			return BLK_STS_RESOURCE;
+		if (ret == BLK_STS_RESOURCE)
+			__blk_mq_requeue_request(rq);
+		return ret;
+	}
 	if (ret == BLK_STS_AGAIN)
 		goto insert;
 
@@ -1763,8 +1775,11 @@  static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	}
 
 insert:
-	blk_mq_sched_insert_request(rq, false, run_queue, false,
-					hctx->flags & BLK_MQ_F_BLOCKING);
+	if (!dispatch_only)
+		blk_mq_sched_insert_request(rq, false, run_queue, false,
+				hctx->flags & BLK_MQ_F_BLOCKING);
+	else
+		blk_mq_request_bypass_insert(rq, run_queue);
 	return ret;
 }
 
@@ -1784,6 +1799,14 @@  static blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
+blk_status_t blk_mq_request_direct_issue(struct request *rq)
+{
+	struct blk_mq_ctx *ctx = rq->mq_ctx;
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
+
+	return blk_mq_try_issue_directly(hctx, rq, NULL);
+}
+
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int is_sync = op_is_sync(bio->bi_opf);