From patchwork Mon Jul 31 16:51:11 2017
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 9872577
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org, Christoph Hellwig
Cc: Bart Van Assche, linux-scsi@vger.kernel.org, "Martin K. Petersen", "James E. J. Bottomley", Ming Lei
Subject: [PATCH 14/14] blk-mq-sched: improve IO scheduling on SCSI device
Date: Tue, 1 Aug 2017 00:51:11 +0800
Message-Id: <20170731165111.11536-16-ming.lei@redhat.com>
In-Reply-To: <20170731165111.11536-1-ming.lei@redhat.com>
References: <20170731165111.11536-1-ming.lei@redhat.com>
X-Mailing-List: linux-scsi@vger.kernel.org

SCSI devices often have a per-request_queue queue depth (.cmd_per_lun), which is in fact applied across all hw queues; this patchset calls it the shared queue depth. One principle of I/O scheduling is that we should not dequeue a request from the sw/scheduler queue and dispatch it to the driver while the low-level queue is busy. For a SCSI device, whether the queue is busy depends on this per-request_queue limit, so all hw queues have to be held back while the request queue is busy.
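[Not part of the patch: a minimal user-space sketch of the "shared queue depth" idea above. All names (sdev_inflight, try_dispatch, CMD_PER_LUN as a macro) are illustrative, not kernel APIs. The point is that every hw queue draws from one per-device budget, so one hw queue hitting the limit really means the whole device is busy.]

#include <stdbool.h>
#include <stdio.h>

#define NR_HW_QUEUES	4
#define CMD_PER_LUN	3	/* per-device (per-request_queue) limit */

static int sdev_inflight;	/* shared by all hw queues of the device */

static bool try_dispatch(int hw_queue)
{
	if (sdev_inflight >= CMD_PER_LUN) {
		printf("hctx%d: device busy, hold back *all* hw queues\n",
		       hw_queue);
		return false;
	}
	sdev_inflight++;
	printf("hctx%d: dispatched, %d/%d in flight\n",
	       hw_queue, sdev_inflight, CMD_PER_LUN);
	return true;
}

int main(void)
{
	/* Each hw queue tries to send one request to the same device. */
	for (int i = 0; i < NR_HW_QUEUES; i++)
		try_dispatch(i);
	return 0;
}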
This patch introduces a per-request_queue dispatch list for that purpose; only once every request on this list has been dispatched successfully do we resume dequeuing requests from the sw/scheduler queues and dispatching them to the LLD.

Signed-off-by: Ming Lei
---
 block/blk-mq.c         |  8 +++++++-
 block/blk-mq.h         | 14 +++++++++++---
 include/linux/blkdev.h |  5 +++++
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9b8b3a740d18..6d02901d798e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2667,8 +2667,14 @@ int blk_mq_update_sched_queue_depth(struct request_queue *q)
	 * this queue depth limit
	 */
	if (q->queue_depth) {
-		queue_for_each_hw_ctx(q, hctx, i)
+		queue_for_each_hw_ctx(q, hctx, i) {
			hctx->flags |= BLK_MQ_F_SHARED_DEPTH;
+			hctx->dispatch_lock = &q->__mq_dispatch_lock;
+			hctx->dispatch_list = &q->__mq_dispatch_list;
+
+			spin_lock_init(hctx->dispatch_lock);
+			INIT_LIST_HEAD(hctx->dispatch_list);
+		}
	}

	if (!q->elevator)
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a8788058da56..4853d422836f 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -138,19 +138,27 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
 static inline bool blk_mq_hctx_is_busy(struct request_queue *q,
		struct blk_mq_hw_ctx *hctx)
 {
-	return test_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		return test_bit(BLK_MQ_S_BUSY, &hctx->state);
+	return q->mq_dispatch_busy;
 }

 static inline void blk_mq_hctx_set_busy(struct request_queue *q,
		struct blk_mq_hw_ctx *hctx)
 {
-	set_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		set_bit(BLK_MQ_S_BUSY, &hctx->state);
+	else
+		q->mq_dispatch_busy = 1;
 }

 static inline void blk_mq_hctx_clear_busy(struct request_queue *q,
		struct blk_mq_hw_ctx *hctx)
 {
-	clear_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		clear_bit(BLK_MQ_S_BUSY, &hctx->state);
+	else
+		q->mq_dispatch_busy = 0;
 }

 static inline bool blk_mq_has_dispatch_rqs(struct blk_mq_hw_ctx *hctx)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 25f6a0cb27d3..bc0e607710f2 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -395,6 +395,11 @@ struct request_queue {

	atomic_t		shared_hctx_restart;

+	/* blk-mq dispatch list and lock for shared queue depth case */
+	struct list_head	__mq_dispatch_list;
+	spinlock_t		__mq_dispatch_lock;
+	unsigned int		mq_dispatch_busy;
+
	struct blk_queue_stats	*stats;
	struct rq_wb		*rq_wb;
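
[Not part of the patch: a user-space sketch of the dispatch-list discipline described in the commit message. Names (parked, queue_busy, driver_slots, complete_one) are illustrative, not the patch's identifiers. A request the driver cannot take is parked on a single per-request_queue list and the queue-wide busy flag is set; the flag is cleared, and the scheduler consulted again, only once the parked list is empty.]

#include <stdbool.h>
#include <stdio.h>

struct req { int id; struct req *next; };

static struct req *parked;	/* stands in for the per-request_queue dispatch list */
static bool queue_busy;		/* stands in for q->mq_dispatch_busy */
static int driver_slots = 2;	/* fake device capacity */

static void dispatch(struct req *rq)
{
	if (!queue_busy && driver_slots > 0) {
		driver_slots--;
		printf("req %d sent to driver\n", rq->id);
		return;
	}
	/* Device busy: park the request and mark the whole queue busy. */
	rq->next = parked;
	parked = rq;
	queue_busy = true;
	printf("req %d parked, queue marked busy\n", rq->id);
}

static void complete_one(void)
{
	driver_slots++;
	/* Retry parked requests first; only an empty list clears busy. */
	while (parked && driver_slots > 0) {
		struct req *rq = parked;

		parked = rq->next;
		driver_slots--;
		printf("parked req %d dispatched\n", rq->id);
	}
	if (!parked)
		queue_busy = false;	/* scheduler may be consulted again */
}

int main(void)
{
	struct req rqs[4] = { {1}, {2}, {3}, {4} };

	for (int i = 0; i < 4; i++)
		dispatch(&rqs[i]);	/* reqs 3 and 4 get parked */
	complete_one();			/* frees one slot, drains one parked req */
	complete_one();			/* list now empty -> busy flag cleared */
	return 0;
}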