From patchwork Sat Sep 30 11:46:52 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 9979475
X-Patchwork-Delegate: snitzer@redhat.com
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org, Christoph Hellwig,
	Mike Snitzer, dm-devel@redhat.com
Cc: Bart Van Assche, Ming Lei, Laurence Oberman,
	linux-kernel@vger.kernel.org, Omar Sandoval
Date: Sat, 30 Sep 2017 19:46:52 +0800
Message-Id: <20170930114652.32441-6-ming.lei@redhat.com>
In-Reply-To: <20170930114652.32441-1-ming.lei@redhat.com>
References: <20170930114652.32441-1-ming.lei@redhat.com>
Subject: [dm-devel] [PATCH 5/5] dm-rq: improve I/O merge by dealing with
	underlying STS_RESOURCE

If the underlying queue returns BLK_STS_RESOURCE, let dm-rq handle the
requeue instead of blk-mq. This improves I/O merging because dm-rq can
now see and handle the underlying queue's out-of-resource condition.

The following IOPS results are for mpath on lpfc, measured with
fio (libaio, bs:4k, dio, queue_depth:64, 8 jobs).

1) blk-mq none scheduler
-----------------------------------------------------------
 IOPS(K)   | v4.14-rc2   | v4.14-rc2 with | v4.14-rc2 with
           |             | [1][2]         | [1][2][3]
-----------------------------------------------------------
 read      | 53.69       | 40.26          | 94.61
-----------------------------------------------------------
 randread  | 24.64       | 30.08          | 35.57
-----------------------------------------------------------
 write     | 39.55       | 41.51          | 216.84
-----------------------------------------------------------
 randwrite | 33.97       | 34.27          | 33.98
-----------------------------------------------------------

2) blk-mq mq-deadline scheduler
-----------------------------------------------------------
 IOPS(K)   | v4.14-rc2   | v4.14-rc2 with | v4.14-rc2 with
           |             | [1][2]         | [1][2][3]
-----------------------------------------------------------
 IOPS(K)   | MQ-DEADLINE | MQ-DEADLINE    | MQ-DEADLINE
-----------------------------------------------------------
 read      | 23.81       | 21.91          | 89.94
-----------------------------------------------------------
 randread  | 38.47       | 38.96          | 38.02
-----------------------------------------------------------
 write     | 39.52       | 40.2           | 225.75
-----------------------------------------------------------
 randwrite | 34.8        | 33.73          | 33.44
-----------------------------------------------------------

[1] [PATCH V5 0/7] blk-mq-sched: improve sequential I/O performance(part 1)

	https://marc.info/?l=linux-block&m=150676854821077&w=2

[2] [PATCH V5 0/8] blk-mq: improve bio merge for none scheduler

	https://marc.info/?l=linux-block&m=150677085521416&w=2

[3] this patchset

Signed-off-by: Ming Lei
---
 block/blk-mq.c     | 17 +----------------
 drivers/md/dm-rq.c | 14 ++++++++++++--
 2 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9a3a561a63b5..58d2268f9733 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1467,17 +1467,6 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	blk_mq_hctx_mark_pending(hctx, ctx);
 }
 
-static void blk_mq_request_direct_insert(struct blk_mq_hw_ctx *hctx,
-					 struct request *rq)
-{
-	spin_lock(&hctx->lock);
-	list_add_tail(&rq->queuelist, &hctx->dispatch);
-	set_bit(BLK_MQ_S_DISPATCH_BUSY, &hctx->state);
-	spin_unlock(&hctx->lock);
-
-	blk_mq_run_hw_queue(hctx, false);
-}
-
 /*
  * Should only be used carefully, when the caller knows we want to
  * bypass a potential IO scheduler on the target device.
@@ -1487,12 +1476,8 @@ blk_status_t blk_mq_request_bypass_insert(struct request *rq)
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
 	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
 	blk_qc_t cookie;
-	blk_status_t ret;
 
-	ret = blk_mq_try_issue_directly(hctx, rq, &cookie, true);
-	if (ret == BLK_STS_RESOURCE)
-		blk_mq_request_direct_insert(hctx, rq);
-	return ret;
+	return blk_mq_try_issue_directly(hctx, rq, &cookie, true);
 }
 
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 2ef524bddd38..feb49c4d6fa2 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -405,7 +405,7 @@ static void end_clone_request(struct request *clone, blk_status_t error)
 	dm_complete_request(tio->orig, error);
 }
 
-static void dm_dispatch_clone_request(struct request *clone, struct request *rq)
+static blk_status_t dm_dispatch_clone_request(struct request *clone, struct request *rq)
 {
 	blk_status_t r;
 
@@ -417,6 +417,7 @@ static void dm_dispatch_clone_request(struct request *clone, struct request *rq)
 	if (r != BLK_STS_OK && r != BLK_STS_RESOURCE)
 		/* must complete clone in terms of original request */
 		dm_complete_request(rq, r);
+	return r;
 }
 
 static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig,
@@ -490,8 +491,10 @@ static int map_request(struct dm_rq_target_io *tio)
 	struct request *rq = tio->orig;
 	struct request *cache = tio->clone;
 	struct request *clone = cache;
+	blk_status_t ret;
 
 	r = ti->type->clone_and_map_rq(ti, rq, &tio->info, &clone);
+ again:
 	switch (r) {
 	case DM_MAPIO_SUBMITTED:
 		/* The target has taken the I/O to submit by itself later */
@@ -509,7 +512,14 @@ static int map_request(struct dm_rq_target_io *tio)
 		/* The target has remapped the I/O so dispatch it */
 		trace_block_rq_remap(clone->q, clone, disk_devt(dm_disk(md)),
 				     blk_rq_pos(rq));
-		dm_dispatch_clone_request(clone, rq);
+		ret = dm_dispatch_clone_request(clone, rq);
+		if (ret == BLK_STS_RESOURCE) {
+			if (!rq->q->mq_ops)
+				r = DM_MAPIO_DELAY_REQUEUE;
+			else
+				r = DM_MAPIO_REQUEUE;
+			goto again;
+		}
 		break;
 	case DM_MAPIO_REQUEUE:
 		/* The target wants to requeue the I/O */
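
The decision added to map_request() above can be read in isolation as a
small pure function. The sketch below is only an illustration for readers
of the patch, not kernel code: the sketch_* enums and the function name
are hypothetical stand-ins for blk_status_t and the DM_MAPIO_* codes, and
the boolean parameter stands in for the rq->q->mq_ops check.

```c
#include <stdbool.h>

/* Stand-in status codes; names and values are illustrative, not the kernel's. */
enum sketch_status {
	SKETCH_STS_OK,
	SKETCH_STS_RESOURCE,
	SKETCH_STS_IOERR
};

/* Stand-in mapping results, loosely modelled on the DM_MAPIO_* codes. */
enum sketch_mapio {
	SKETCH_MAPIO_REMAPPED,
	SKETCH_MAPIO_REQUEUE,
	SKETCH_MAPIO_DELAY_REQUEUE
};

/*
 * Mirrors the branch the patch adds after dispatching a remapped clone:
 * an out-of-resource status from the underlying queue is turned into a
 * dm-rq requeue. A dm queue without mq_ops (queue_is_mq == false) gets
 * the delayed variant; a blk-mq dm queue gets the plain requeue. Any
 * other status leaves the mapping result untouched.
 */
static enum sketch_mapio sketch_after_dispatch(enum sketch_status ret,
					       bool queue_is_mq)
{
	if (ret != SKETCH_STS_RESOURCE)
		return SKETCH_MAPIO_REMAPPED;

	return queue_is_mq ? SKETCH_MAPIO_REQUEUE : SKETCH_MAPIO_DELAY_REQUEUE;
}
```

Under these assumptions, sketch_after_dispatch(SKETCH_STS_RESOURCE, true)
selects the plain requeue and sketch_after_dispatch(SKETCH_STS_RESOURCE,
false) selects the delayed one, which matches the two assignments to r
before "goto again" in the hunk above.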