From patchwork Sat Aug 26 16:33:32 2017
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 9923587
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, Christoph Hellwig
Cc: Bart Van Assche, Laurence Oberman, Paolo Valente, Mel Gorman,
    Ming Lei
Subject: [PATCH V3 14/14] blk-mq: improve bio merge from blk-mq sw queue
Date: Sun, 27 Aug 2017 00:33:32 +0800
Message-Id: <20170826163332.28971-15-ming.lei@redhat.com>
In-Reply-To: <20170826163332.28971-1-ming.lei@redhat.com>
References: <20170826163332.28971-1-ming.lei@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

This patch uses a hash table to do bio merging against the sw queue, so
that bio merging is aligned with the way the blk-mq schedulers and the
legacy block layer do it. Merging via a hash table turns out to be more
efficient than the simple merge attempt against the last 8 requests in
the sw queue: on SCSI SRP, a ~10% IOPS increase is observed in a
sequential I/O test with this patch. It is also one step towards a real
'none' scheduler, which would allow the blk-mq scheduler framework to
become cleaner.
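For illustration only, here is a minimal userspace sketch of the idea,
not the kernel implementation: requests are hashed by the sector at
which they end, so a back-merge candidate for an incoming bio is found
with a single lookup keyed by the bio's start sector instead of
scanning the tail of the sw queue. The types and the rqhash_*() /
rq_hash_key() helpers below are simplified stand-ins for the ones added
earlier in this series.

/*
 * Illustrative userspace model (not kernel code): a request hash keyed
 * by end sector, so a back-merge candidate is one lookup away.
 */
#include <stdio.h>

#define HASH_BITS	6
#define HASH_SIZE	(1 << HASH_BITS)

struct rq {				/* simplified stand-in for struct request */
	unsigned long long start;	/* first sector of the request */
	unsigned int nr_sectors;	/* length in sectors */
	struct rq *hash_next;		/* hash chain link */
};

/* requests are keyed by the sector immediately after their last one */
static unsigned long long rq_hash_key(const struct rq *rq)
{
	return rq->start + rq->nr_sectors;
}

static unsigned int hash_bucket(unsigned long long key)
{
	return (unsigned int)(key & (HASH_SIZE - 1));
}

static void rqhash_add(struct rq **hash, struct rq *rq)
{
	unsigned int b = hash_bucket(rq_hash_key(rq));

	rq->hash_next = hash[b];
	hash[b] = rq;
}

/* find a request that ends exactly where the new bio starts (back merge) */
static struct rq *rqhash_find(struct rq **hash, unsigned long long bio_start)
{
	struct rq *rq;

	for (rq = hash[hash_bucket(bio_start)]; rq; rq = rq->hash_next)
		if (rq_hash_key(rq) == bio_start)
			return rq;
	return NULL;
}

int main(void)
{
	struct rq *hash[HASH_SIZE] = { NULL };
	struct rq a = { .start = 0, .nr_sectors = 8 };
	struct rq b = { .start = 64, .nr_sectors = 8 };

	rqhash_add(hash, &a);
	rqhash_add(hash, &b);

	/* a bio starting at sector 8 can be back-merged into request 'a' */
	printf("candidate for sector 8:   %p (a is %p)\n",
	       (void *)rqhash_find(hash, 8), (void *)&a);
	/* a bio starting at sector 100 has no merge candidate */
	printf("candidate for sector 100: %p\n",
	       (void *)rqhash_find(hash, 100));
	return 0;
}
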
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-sched.c | 49 ++++++++++++-------------------------------------
 block/blk-mq.c       | 28 +++++++++++++++++++++++++---
 2 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 5af0ff71730c..b958caa8bccb 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -268,50 +268,25 @@ bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
 }
 EXPORT_SYMBOL_GPL(blk_mq_sched_try_merge);
 
-/*
- * Reverse check our software queue for entries that we could potentially
- * merge with. Currently includes a hand-wavy stop count of 8, to not spend
- * too much time checking for merges.
- */
-static bool blk_mq_attempt_merge(struct request_queue *q,
+static bool blk_mq_ctx_try_merge(struct request_queue *q,
 				 struct blk_mq_ctx *ctx, struct bio *bio)
 {
-	struct request *rq;
-	int checked = 8;
+	struct request *rq, *free = NULL;
+	enum elv_merge type;
+	bool merged;
 
 	lockdep_assert_held(&ctx->lock);
 
-	list_for_each_entry_reverse(rq, &ctx->rq_list, queuelist) {
-		bool merged = false;
-
-		if (!checked--)
-			break;
-
-		if (!blk_rq_merge_ok(rq, bio))
-			continue;
+	type = elv_merge_ctx(q, &rq, bio, ctx);
+	merged = __blk_mq_try_merge(q, bio, &free, rq, type);
 
-		switch (blk_try_merge(rq, bio)) {
-		case ELEVATOR_BACK_MERGE:
-			if (blk_mq_sched_allow_merge(q, rq, bio))
-				merged = bio_attempt_back_merge(q, rq, bio);
-			break;
-		case ELEVATOR_FRONT_MERGE:
-			if (blk_mq_sched_allow_merge(q, rq, bio))
-				merged = bio_attempt_front_merge(q, rq, bio);
-			break;
-		case ELEVATOR_DISCARD_MERGE:
-			merged = bio_attempt_discard_merge(q, rq, bio);
-			break;
-		default:
-			continue;
-		}
+	if (free)
+		blk_mq_free_request(free);
 
-		if (merged)
-			ctx->rq_merged++;
-		return merged;
-	}
+	if (merged)
+		ctx->rq_merged++;
 
-	return false;
+	return merged;
 }
 
 bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
@@ -329,7 +304,7 @@ bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
 	if (hctx->flags & BLK_MQ_F_SHOULD_MERGE) {
 		/* default per sw-queue merge */
 		spin_lock(&ctx->lock);
-		ret = blk_mq_attempt_merge(q, ctx, bio);
+		ret = blk_mq_ctx_try_merge(q, ctx, bio);
 		spin_unlock(&ctx->lock);
 	}
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index fc3d26bbfc1a..d935f15c54da 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -847,6 +847,18 @@ static void blk_mq_timeout_work(struct work_struct *work)
 	blk_queue_exit(q);
 }
 
+static void blk_mq_ctx_remove_rq_list(struct blk_mq_ctx *ctx,
+		struct list_head *head)
+{
+	struct request *rq;
+
+	lockdep_assert_held(&ctx->lock);
+
+	list_for_each_entry(rq, head, queuelist)
+		rqhash_del(rq);
+	ctx->last_merge = NULL;
+}
+
 struct flush_busy_ctx_data {
 	struct blk_mq_hw_ctx *hctx;
 	struct list_head *list;
@@ -861,6 +873,7 @@ static bool flush_busy_ctx(struct sbitmap *sb, unsigned int bitnr, void *data)
 	sbitmap_clear_bit(sb, bitnr);
 	spin_lock(&ctx->lock);
 	list_splice_tail_init(&ctx->rq_list, flush_data->list);
+	blk_mq_ctx_remove_rq_list(ctx, flush_data->list);
 	spin_unlock(&ctx->lock);
 	return true;
 }
@@ -890,17 +903,23 @@ static bool dispatch_rq_from_ctx(struct sbitmap *sb, unsigned int bitnr, void *d
 	struct dispatch_rq_data *dispatch_data = data;
 	struct blk_mq_hw_ctx *hctx = dispatch_data->hctx;
 	struct blk_mq_ctx *ctx = hctx->ctxs[bitnr];
+	struct request *rq = NULL;
 
 	spin_lock(&ctx->lock);
 	if (unlikely(!list_empty(&ctx->rq_list))) {
-		dispatch_data->rq = list_entry_rq(ctx->rq_list.next);
-		list_del_init(&dispatch_data->rq->queuelist);
+		rq = list_entry_rq(ctx->rq_list.next);
+		list_del_init(&rq->queuelist);
+		rqhash_del(rq);
 		if (list_empty(&ctx->rq_list))
 			sbitmap_clear_bit(sb, bitnr);
 	}
+	if (ctx->last_merge == rq)
+		ctx->last_merge = NULL;
 	spin_unlock(&ctx->lock);
 
-	return !dispatch_data->rq;
+	dispatch_data->rq = rq;
+
+	return !rq;
 }
 
 struct request *blk_mq_dispatch_rq_from_ctx(struct blk_mq_hw_ctx *hctx,
@@ -1431,6 +1450,8 @@ static inline void __blk_mq_insert_req_list(struct blk_mq_hw_ctx *hctx,
 		list_add(&rq->queuelist, &ctx->rq_list);
 	else
 		list_add_tail(&rq->queuelist, &ctx->rq_list);
+
+	rqhash_add(ctx->hash, rq);
 }
 
 void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
@@ -1923,6 +1944,7 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, struct hlist_node *node)
 	spin_lock(&ctx->lock);
 	if (!list_empty(&ctx->rq_list)) {
 		list_splice_init(&ctx->rq_list, &tmp);
+		blk_mq_ctx_remove_rq_list(ctx, &tmp);
 		blk_mq_hctx_clear_pending(hctx, ctx);
 	}
 	spin_unlock(&ctx->lock);