From patchwork Thu Sep 21 18:57:31 2017
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 9963205
From: Jianchao Wang
To: Jens Axboe, hch@infradead.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    Jianchao Wang
Subject: [PATCH v2] block: consider merge of segments when merging bio into rq
Date: Fri, 22 Sep 2017 02:57:31 +0800
Message-Id: <1506020251-24768-1-git-send-email-jianchao.w.wang@oracle.com>
X-Mailer: git-send-email 2.7.4

When accounting nr_phys_segments while merging bios into a rq, only the
segment merging inside each individual bio is considered, not the merging
across the bios already in the rq. This makes the nr_phys_segments of the
rq larger than the real count when segments of adjacent bios in the rq are
contiguous and mergeable. The nr_phys_segments of the rq then hits
max_segments of q and merging stops, even though the sectors of the rq may
still be far below max_sectors of q. In practice, with a mkfs.ext4 workload
on my local machine, merging stopped due to the max_segments limit while
the segments in the rq were still contiguous and mergeable. This hurts the
performance of sequential operations.
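As an illustration of the over-count (not part of the patch; plain user-space C with made-up numbers matching the max_segments=168 case above):

	/* toy_overcount.c - why per-bio accounting refuses a merge the hardware could take */
	#include <stdbool.h>
	#include <stdio.h>

	#define QUEUE_MAX_SEGMENTS 168		/* max_segments of q on the test machine */

	int main(void)
	{
		int rq_segs  = 168;	/* rq already at the segment limit */
		int bio_segs = 1;	/* incoming bio, physically contiguous with the rq tail */

		/* old accounting: counts are simply added, contiguity across bios is ignored */
		bool old_merge = (rq_segs + bio_segs) <= QUEUE_MAX_SEGMENTS;

		/* what the hardware actually sees: the two boundary segments coalesce into one */
		int real_segs = rq_segs + bio_segs - 1;

		printf("old accounting: %s (counted %d segments)\n",
		       old_merge ? "merge" : "no merge", rq_segs + bio_segs);
		printf("actual segments after coalescing: %d\n", real_segs);
		return 0;
	}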
To fix this, take segment merging into account when accounting
nr_phys_segments of the rq while merging a bio into it: decrease the
nr_phys_segments of the rq by 1 when the adjacent segments of the bio and
the rq are contiguous and mergeable. This allows the rq to be merged more
fully, which gives better performance for sequential operations and also
avoids wasting scatterlist entries. With the mkfs.ext4 workload on my local
machine, the final size of issued rqs rises from 168 sectors (max_segments
is 168) to 2560 sectors (max_sectors_kb is 1280).

V2: Add more comments to elaborate on how this issue was found and on the
result after applying the patch.

Signed-off-by: Jianchao Wang
---
 block/blk-merge.c | 98 ++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 64 insertions(+), 34 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 14b6e37..b2f54fd 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -472,54 +472,60 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 }
 EXPORT_SYMBOL(blk_rq_map_sg);
 
-static inline int ll_new_hw_segment(struct request_queue *q,
-				    struct request *req,
-				    struct bio *bio)
-{
-	int nr_phys_segs = bio_phys_segments(q, bio);
-
-	if (req->nr_phys_segments + nr_phys_segs > queue_max_segments(q))
-		goto no_merge;
-
-	if (blk_integrity_merge_bio(q, req, bio) == false)
-		goto no_merge;
-
-	/*
-	 * This will form the start of a new hw segment.  Bump both
-	 * counters.
-	 */
-	req->nr_phys_segments += nr_phys_segs;
-	return 1;
-
-no_merge:
-	req_set_nomerge(q, req);
-	return 0;
-}
-
 int ll_back_merge_fn(struct request_queue *q, struct request *req,
 		     struct bio *bio)
 {
+	unsigned int seg_size;
+	int total_nr_phys_segs;
+	bool contig;
+
 	if (req_gap_back_merge(req, bio))
 		return 0;
 	if (blk_integrity_rq(req) &&
 	    integrity_req_gap_back_merge(req, bio))
 		return 0;
 	if (blk_rq_sectors(req) + bio_sectors(bio) >
-	    blk_rq_get_max_sectors(req, blk_rq_pos(req))) {
-		req_set_nomerge(q, req);
-		return 0;
-	}
+	    blk_rq_get_max_sectors(req, blk_rq_pos(req)))
+		goto no_merge;
+
 	if (!bio_flagged(req->biotail, BIO_SEG_VALID))
 		blk_recount_segments(q, req->biotail);
 	if (!bio_flagged(bio, BIO_SEG_VALID))
 		blk_recount_segments(q, bio);
 
-	return ll_new_hw_segment(q, req, bio);
+	if (blk_integrity_merge_bio(q, req, bio) == false)
+		goto no_merge;
+
+	seg_size = req->biotail->bi_seg_back_size + bio->bi_seg_front_size;
+	total_nr_phys_segs = req->nr_phys_segments + bio_phys_segments(q, bio);
+
+	contig = blk_phys_contig_segment(q, req->biotail, bio);
+	if (contig)
+		total_nr_phys_segs--;
+
+	if (unlikely(total_nr_phys_segs > queue_max_segments(q)))
+		goto no_merge;
+
+	if (contig) {
+		if (req->nr_phys_segments == 1)
+			req->bio->bi_seg_front_size = seg_size;
+		if (bio->bi_phys_segments == 1)
+			bio->bi_seg_back_size = seg_size;
+	}
+	req->nr_phys_segments = total_nr_phys_segs;
+	return 1;
+
+no_merge:
+	req_set_nomerge(q, req);
+	return 0;
 }
 
 int ll_front_merge_fn(struct request_queue *q, struct request *req,
 		      struct bio *bio)
 {
+	unsigned int seg_size;
+	int total_nr_phys_segs;
+	bool contig;
 
 	if (req_gap_front_merge(req, bio))
 		return 0;
@@ -527,16 +533,40 @@ int ll_front_merge_fn(struct request_queue *q, struct request *req,
 	    integrity_req_gap_front_merge(req, bio))
 		return 0;
 	if (blk_rq_sectors(req) + bio_sectors(bio) >
-	    blk_rq_get_max_sectors(req, bio->bi_iter.bi_sector)) {
-		req_set_nomerge(q, req);
-		return 0;
-	}
+	    blk_rq_get_max_sectors(req, bio->bi_iter.bi_sector))
+		goto no_merge;
+
 	if (!bio_flagged(bio, BIO_SEG_VALID))
 		blk_recount_segments(q, bio);
 	if (!bio_flagged(req->bio, BIO_SEG_VALID))
 		blk_recount_segments(q, req->bio);
 
-	return ll_new_hw_segment(q, req, bio);
+	if (blk_integrity_merge_bio(q, req, bio) == false)
+		goto no_merge;
+
+	seg_size = req->bio->bi_seg_front_size + bio->bi_seg_back_size;
+	total_nr_phys_segs = req->nr_phys_segments + bio_phys_segments(q, bio);
+
+	contig = blk_phys_contig_segment(q, bio, req->bio);
+	if (contig)
+		total_nr_phys_segs--;
+
+	if (unlikely(total_nr_phys_segs > queue_max_segments(q)))
+		goto no_merge;
+
+	if (contig) {
+		if (req->nr_phys_segments == 1)
+			req->biotail->bi_seg_back_size = seg_size;
+		if (bio->bi_phys_segments == 1)
+			bio->bi_seg_front_size = seg_size;
+	}
+
+	req->nr_phys_segments = total_nr_phys_segs;
+	return 1;
+
+no_merge:
+	req_set_nomerge(q, req);
+	return 0;
 }
 
 /*
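For readers skimming the diff, the back-merge accounting above boils down to the following condensed user-space sketch; `seg_owner` is a made-up stand-in for the segment bookkeeping fields of struct request and struct bio, and the `contig` parameter stands for the result of blk_phys_contig_segment():

	/* condensed model of the patched accounting, for illustration only */
	#include <stdbool.h>
	#include <stdio.h>

	struct seg_owner {
		int nr_phys_segments;		/* physical segment count */
		unsigned int front_size;	/* size of the first physical segment */
		unsigned int back_size;		/* size of the last physical segment */
	};

	static bool try_back_merge(struct seg_owner *rq, struct seg_owner *bio,
				   bool contig, int max_segments)
	{
		/* size the merged boundary segment would have */
		unsigned int seg_size = rq->back_size + bio->front_size;
		int total = rq->nr_phys_segments + bio->nr_phys_segments;

		if (contig)
			total--;		/* the two boundary segments coalesce into one */
		if (total > max_segments)
			return false;		/* would exceed max_segments of q */

		if (contig) {
			/* a single-segment side now consists entirely of the merged segment */
			if (rq->nr_phys_segments == 1)
				rq->front_size = seg_size;
			if (bio->nr_phys_segments == 1)
				bio->back_size = seg_size;
		}
		rq->nr_phys_segments = total;
		return true;
	}

	int main(void)
	{
		struct seg_owner rq  = { .nr_phys_segments = 168, .front_size = 4096, .back_size = 4096 };
		struct seg_owner bio = { .nr_phys_segments = 1,   .front_size = 4096, .back_size = 4096 };

		/* contiguous boundary: 168 + 1 - 1 = 168 segments, so the merge is allowed */
		printf("merge: %d, segments now: %d\n",
		       try_back_merge(&rq, &bio, true, 168), rq.nr_phys_segments);
		return 0;
	}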