From patchwork Tue Jan 16 16:41:04 2018
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 10167619
Date: Tue, 16 Jan 2018 11:41:04 -0500
From: Mike Snitzer
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, Ming Lei, dm-devel@redhat.com, hch@lst.de
Subject: Re: [for-4.16 PATCH v4-mike 2/2] blk-mq: issue request directly for blk_insert_cloned_request
Message-ID: <20180116164103.GA21650@redhat.com>
References: <20180116150148.21145-1-snitzer@redhat.com> <20180116150148.21145-2-snitzer@redhat.com>
In-Reply-To: <20180116150148.21145-2-snitzer@redhat.com>
List-ID: linux-block@vger.kernel.org

On Tue, Jan 16 2018 at 10:01P -0500, Mike Snitzer wrote:

> From: Ming Lei
> 
> blk_insert_cloned_request() is called in the fast path of the dm-rq
> driver, where we append the request directly to the hctx->dispatch_list
> of the underlying queue.
> 
> 1) This isn't efficient, because the hctx lock is always required.
> 
> 2) With blk_insert_cloned_request() we bypass the underlying queue's IO
> scheduler entirely and depend on the dm-rq driver to do all IO
> scheduling. But the dm-rq driver gets no dispatch feedback from the
> underlying queue, and that information is extremely useful for IO
> merging. Without it, blk-mq essentially cannot merge IO, which causes
> very bad sequential IO performance.
> 
> Fix this by having blk_insert_cloned_request() make use of
> blk_mq_try_issue_directly() via blk_mq_request_direct_issue().
> blk_mq_request_direct_issue() allows a request to be issued directly
> to the underlying queue and provides the dispatch result to dm-rq
> and blk-mq.
> 
> With this, DM's blk-mq sequential IO performance is vastly improved
> (as much as 3X in mpath/virtio-scsi testing).
> 
> Signed-off-by: Ming Lei
> Signed-off-by: Mike Snitzer
> ---
>  block/blk-core.c   |  3 +--
>  block/blk-mq.c     | 42 ++++++++++++++++++++++++++++++++++++------
>  block/blk-mq.h     |  3 +++
>  drivers/md/dm-rq.c | 19 ++++++++++++++++---
>  4 files changed, 56 insertions(+), 11 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 7ba607527487..55f338020254 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2500,8 +2500,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
>  		 * bypass a potential scheduler on the bottom device for
>  		 * insert.
>  		 */
> -		blk_mq_request_bypass_insert(rq, true);
> -		return BLK_STS_OK;
> +		return blk_mq_request_direct_issue(rq);
>  	}
>  
>  	spin_lock_irqsave(q->queue_lock, flags);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 55f3a27fb2e6..3168a13cb012 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1706,6 +1706,12 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  	blk_qc_t new_cookie;
>  	blk_status_t ret = BLK_STS_OK;
>  	bool run_queue = true;
> +	/*
> +	 * If @cookie is NULL do not insert the request, this mode is used
> +	 * by blk_insert_cloned_request() via blk_mq_request_direct_issue()
> +	 */
> +	bool dispatch_only = !cookie;
> +	bool need_insert = false;
>  
>  	/* RCU or SRCU read lock is needed before checking quiesced flag */
>  	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
> @@ -1713,25 +1719,38 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  		goto insert;
>  	}
>  
> -	if (q->elevator)
> +	if (q->elevator && !dispatch_only)
>  		goto insert;
>  
>  	if (!blk_mq_get_driver_tag(rq, NULL, false))
> -		goto insert;
> +		need_insert = true;
>  
> -	if (!blk_mq_get_dispatch_budget(hctx)) {
> +	if (!need_insert && !blk_mq_get_dispatch_budget(hctx)) {
>  		blk_mq_put_driver_tag(rq);
> +		need_insert = true;
> +	}
> +
> +	if (need_insert) {
> +		if (dispatch_only)
> +			return BLK_STS_RESOURCE;
>  		goto insert;
>  	}
>  
>  	new_cookie = request_to_qc_t(hctx, rq);
>  
> +	ret = q->mq_ops->queue_rq(hctx, &bd);
> +
> +	if (dispatch_only) {
> +		if (ret == BLK_STS_RESOURCE)
> +			__blk_mq_requeue_request(rq);
> +		return ret;
> +	}
> +
>  	/*
>  	 * For OK queue, we are done. For error, kill it. Any other
>  	 * error (busy), just add it to our list as we previously
>  	 * would have done
>  	 */
> -	ret = q->mq_ops->queue_rq(hctx, &bd);
>  	switch (ret) {
>  	case BLK_STS_OK:
>  		*cookie = new_cookie;

FYI, because blk_mq_put_driver_tag() is idempotent, the above can be
simplified by eliminating 'need_insert' like so (Jens, feel free to fold
this in?):

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3168a13cb012..c2f66a5c5242 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1711,7 +1711,6 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	 * by blk_insert_cloned_request() via blk_mq_request_direct_issue()
 	 */
 	bool dispatch_only = !cookie;
-	bool need_insert = false;
 
 	/* RCU or SRCU read lock is needed before checking quiesced flag */
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
@@ -1722,15 +1721,9 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if (q->elevator && !dispatch_only)
 		goto insert;
 
-	if (!blk_mq_get_driver_tag(rq, NULL, false))
-		need_insert = true;
-
-	if (!need_insert && !blk_mq_get_dispatch_budget(hctx)) {
+	if (!blk_mq_get_driver_tag(rq, NULL, false) ||
+	    !blk_mq_get_dispatch_budget(hctx)) {
 		blk_mq_put_driver_tag(rq);
-		need_insert = true;
-	}
-
-	if (need_insert) {
 		if (dispatch_only)
 			return BLK_STS_RESOURCE;
 		goto insert;