From patchwork Mon Jul 31 16:51:11 2017
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 9872577
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org, Christoph Hellwig
Cc: Bart Van Assche, linux-scsi@vger.kernel.org, "Martin K. Petersen", "James E. J. Bottomley", Ming Lei
Subject: [PATCH 14/14] blk-mq-sched: improve IO scheduling on SCSI device
Date: Tue, 1 Aug 2017 00:51:11 +0800
Message-Id: <20170731165111.11536-16-ming.lei@redhat.com>
In-Reply-To: <20170731165111.11536-1-ming.lei@redhat.com>
References: <20170731165111.11536-1-ming.lei@redhat.com>
X-Mailing-List: linux-scsi@vger.kernel.org

SCSI devices often have a per-request_queue queue depth (.cmd_per_lun), which is in fact applied across all hw queues; this patchset calls it the shared queue depth. One principle of I/O scheduling is that we should not dequeue a request from the sw/scheduler queue and dispatch it to the driver while the low-level queue is busy. For a SCSI device, whether the queue is busy depends on this per-request_queue limit, so all hw queues have to be held back while the request queue is busy.
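[Not part of the patch: a minimal user-space sketch of the "shared queue depth" idea above. All names (sdev_inflight, try_dispatch, CMD_PER_LUN as a macro) are illustrative, not kernel APIs. The point is that every hw queue draws from one per-device budget, so one hw queue hitting the limit really means the whole device is busy.]

#include <stdbool.h>
#include <stdio.h>

#define NR_HW_QUEUES	4
#define CMD_PER_LUN	3	/* per-device (per-request_queue) limit */

static int sdev_inflight;	/* shared by all hw queues of the device */

static bool try_dispatch(int hw_queue)
{
	if (sdev_inflight >= CMD_PER_LUN) {
		printf("hctx%d: device busy, hold back *all* hw queues\n",
		       hw_queue);
		return false;
	}
	sdev_inflight++;
	printf("hctx%d: dispatched, %d/%d in flight\n",
	       hw_queue, sdev_inflight, CMD_PER_LUN);
	return true;
}

int main(void)
{
	/* Each hw queue tries to send one request to the same device. */
	for (int i = 0; i < NR_HW_QUEUES; i++)
		try_dispatch(i);
	return 0;
}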
This patch introduces a per-request_queue dispatch list for that purpose; only once every request on this list has been dispatched successfully do we resume dequeuing requests from the sw/scheduler queues and dispatching them to the LLD.

Signed-off-by: Ming Lei
---
 block/blk-mq.c         |  8 +++++++-
 block/blk-mq.h         | 14 +++++++++++---
 include/linux/blkdev.h |  5 +++++
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9b8b3a740d18..6d02901d798e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2667,8 +2667,14 @@ int blk_mq_update_sched_queue_depth(struct request_queue *q)
	 * this queue depth limit
	 */
	if (q->queue_depth) {
-		queue_for_each_hw_ctx(q, hctx, i)
+		queue_for_each_hw_ctx(q, hctx, i) {
			hctx->flags |= BLK_MQ_F_SHARED_DEPTH;
+			hctx->dispatch_lock = &q->__mq_dispatch_lock;
+			hctx->dispatch_list = &q->__mq_dispatch_list;
+
+			spin_lock_init(hctx->dispatch_lock);
+			INIT_LIST_HEAD(hctx->dispatch_list);
+		}
	}

	if (!q->elevator)
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a8788058da56..4853d422836f 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -138,19 +138,27 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
 static inline bool blk_mq_hctx_is_busy(struct request_queue *q,
		struct blk_mq_hw_ctx *hctx)
 {
-	return test_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		return test_bit(BLK_MQ_S_BUSY, &hctx->state);
+	return q->mq_dispatch_busy;
 }

 static inline void blk_mq_hctx_set_busy(struct request_queue *q,
		struct blk_mq_hw_ctx *hctx)
 {
-	set_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		set_bit(BLK_MQ_S_BUSY, &hctx->state);
+	else
+		q->mq_dispatch_busy = 1;
 }

 static inline void blk_mq_hctx_clear_busy(struct request_queue *q,
		struct blk_mq_hw_ctx *hctx)
 {
-	clear_bit(BLK_MQ_S_BUSY, &hctx->state);
+	if (!(hctx->flags & BLK_MQ_F_SHARED_DEPTH))
+		clear_bit(BLK_MQ_S_BUSY, &hctx->state);
+	else
+		q->mq_dispatch_busy = 0;
 }

 static inline bool blk_mq_has_dispatch_rqs(struct blk_mq_hw_ctx *hctx)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 25f6a0cb27d3..bc0e607710f2 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -395,6 +395,11 @@ struct request_queue {

	atomic_t		shared_hctx_restart;

+	/* blk-mq dispatch list and lock for shared queue depth case */
+	struct list_head	__mq_dispatch_list;
+	spinlock_t		__mq_dispatch_lock;
+	unsigned int		mq_dispatch_busy;
+
	struct blk_queue_stats	*stats;
	struct rq_wb		*rq_wb;
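
[Not part of the patch: a user-space sketch of the dispatch-list discipline described in the commit message. Names (parked, queue_busy, driver_slots, complete_one) are illustrative, not the patch's identifiers. A request the driver cannot take is parked on a single per-request_queue list and the queue-wide busy flag is set; the flag is cleared, and the scheduler consulted again, only once the parked list is empty.]

#include <stdbool.h>
#include <stdio.h>

struct req { int id; struct req *next; };

static struct req *parked;	/* stands in for the per-request_queue dispatch list */
static bool queue_busy;		/* stands in for q->mq_dispatch_busy */
static int driver_slots = 2;	/* fake device capacity */

static void dispatch(struct req *rq)
{
	if (!queue_busy && driver_slots > 0) {
		driver_slots--;
		printf("req %d sent to driver\n", rq->id);
		return;
	}
	/* Device busy: park the request and mark the whole queue busy. */
	rq->next = parked;
	parked = rq;
	queue_busy = true;
	printf("req %d parked, queue marked busy\n", rq->id);
}

static void complete_one(void)
{
	driver_slots++;
	/* Retry parked requests first; only an empty list clears busy. */
	while (parked && driver_slots > 0) {
		struct req *rq = parked;

		parked = rq->next;
		driver_slots--;
		printf("parked req %d dispatched\n", rq->id);
	}
	if (!parked)
		queue_busy = false;	/* scheduler may be consulted again */
}

int main(void)
{
	struct req rqs[4] = { {1}, {2}, {3}, {4} };

	for (int i = 0; i < 4; i++)
		dispatch(&rqs[i]);	/* reqs 3 and 4 get parked */
	complete_one();			/* frees one slot, drains one parked req */
	complete_one();			/* list now empty -> busy flag cleared */
	return 0;
}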