From patchwork Tue Jan 16 16:41:04 2018
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 10167619
Date: Tue, 16 Jan 2018 11:41:04 -0500
From: Mike Snitzer
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, Ming Lei, dm-devel@redhat.com, hch@lst.de
Subject: Re: [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request
Message-ID: <20180116164103.GA21650@redhat.com>
References: <20180116150148.21145-1-snitzer@redhat.com> <20180116150148.21145-2-snitzer@redhat.com>
In-Reply-To: <20180116150148.21145-2-snitzer@redhat.com>
List-ID: linux-block@vger.kernel.org

On Tue, Jan 16 2018 at 10:01P -0500, Mike Snitzer wrote:

> From: Ming Lei
> 
> blk_insert_cloned_request() is called in the fast path of the dm-rq
> driver, where we append the request directly to the hctx->dispatch_list
> of the underlying queue.
> 
> 1) This isn't efficient, because the hctx lock is always required.
> 
> 2) With blk_insert_cloned_request() we bypass the underlying queue's IO
> scheduler entirely and depend on the dm-rq driver to do all IO
> scheduling. But the dm-rq driver gets no dispatch feedback from the
> underlying queue, and that information is extremely useful for IO
> merging. Without it, blk-mq essentially cannot merge IO, which causes
> very bad sequential IO performance.
> 
> Fix this by having blk_insert_cloned_request() make use of
> blk_mq_try_issue_directly() via blk_mq_request_direct_issue().
> blk_mq_request_direct_issue() allows a request to be issued directly
> to the underlying queue and provides the dispatch result to dm-rq
> and blk-mq.
> 
> With this, DM's blk-mq sequential IO performance is vastly improved
> (as much as 3X in mpath/virtio-scsi testing).
> 
> Signed-off-by: Ming Lei
> Signed-off-by: Mike Snitzer
> ---
>  block/blk-core.c   |  3 +--
>  block/blk-mq.c     | 42 ++++++++++++++++++++++++++++++++++++------
>  block/blk-mq.h     |  3 +++
>  drivers/md/dm-rq.c | 19 ++++++++++++++++---
>  4 files changed, 56 insertions(+), 11 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 7ba607527487..55f338020254 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2500,8 +2500,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
>  		 * bypass a potential scheduler on the bottom device for
>  		 * insert.
>  		 */
> -		blk_mq_request_bypass_insert(rq, true);
> -		return BLK_STS_OK;
> +		return blk_mq_request_direct_issue(rq);
>  	}
>  
>  	spin_lock_irqsave(q->queue_lock, flags);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 55f3a27fb2e6..3168a13cb012 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  	blk_qc_t new_cookie;
>  	blk_status_t ret = BLK_STS_OK;
>  	bool run_queue = true;
> +	/*
> +	 * If @cookie is NULL do not insert the request, this mode is used
> +	 * by blk_insert_cloned_request() via blk_mq_request_direct_issue()
> +	 */
> +	bool dispatch_only = !cookie;
> +	bool need_insert = false;
>  
>  	/* RCU or SRCU read lock is needed before checking quiesced flag */
>  	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
> @@ -1713,25 +1719,38 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  		goto insert;
>  	}
>  
> -	if (q->elevator)
> +	if (q->elevator && !dispatch_only)
>  		goto insert;
>  
>  	if (!blk_mq_get_driver_tag(rq, NULL, false))
> -		goto insert;
> +		need_insert = true;
>  
> -	if (!blk_mq_get_dispatch_budget(hctx)) {
> +	if (!need_insert && !blk_mq_get_dispatch_budget(hctx)) {
>  		blk_mq_put_driver_tag(rq);
> +		need_insert = true;
> +	}
> +
> +	if (need_insert) {
> +		if (dispatch_only)
> +			return BLK_STS_RESOURCE;
>  		goto insert;
>  	}
>  
>  	new_cookie = request_to_qc_t(hctx, rq);
>  
> +	ret = q->mq_ops->queue_rq(hctx, &bd);
> +
> +	if (dispatch_only) {
> +		if (ret == BLK_STS_RESOURCE)
> +			__blk_mq_requeue_request(rq);
> +		return ret;
> +	}
> +
>  	/*
>  	 * For OK queue, we are done. For error, kill it. Any other
>  	 * error (busy), just add it to our list as we previously
>  	 * would have done
>  	 */
> -	ret = q->mq_ops->queue_rq(hctx, &bd);
>  	switch (ret) {
>  	case BLK_STS_OK:
>  		*cookie = new_cookie;

FYI, because blk_mq_put_driver_tag() is idempotent, the above can be
simplified by eliminating 'need_insert' like so (Jens, feel free to fold
this in?):

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3168a13cb012..c2f66a5c5242 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1711,7 +1711,6 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	 * by blk_insert_cloned_request() via blk_mq_request_direct_issue()
 	 */
 	bool dispatch_only = !cookie;
-	bool need_insert = false;
 
 	/* RCU or SRCU read lock is needed before checking quiesced flag */
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
@@ -1722,15 +1721,9 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if (q->elevator && !dispatch_only)
 		goto insert;
 
-	if (!blk_mq_get_driver_tag(rq, NULL, false))
-		need_insert = true;
-
-	if (!need_insert && !blk_mq_get_dispatch_budget(hctx)) {
+	if (!blk_mq_get_driver_tag(rq, NULL, false) ||
+	    !blk_mq_get_dispatch_budget(hctx)) {
 		blk_mq_put_driver_tag(rq);
-		need_insert = true;
-	}
-
-	if (need_insert) {
 		if (dispatch_only)
 			return BLK_STS_RESOURCE;
 		goto insert;