diff mbox

block: fix bio_will_gap()

Message ID 20170413160626.14635-1-ming.lei@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ming Lei April 13, 2017, 4:06 p.m. UTC
Now commit 729204ef49ec("block: relax check on sg gap")
allows us to merge bios if both are physically contiguous,
this change can merge huge of small bios in use case of mkfs,
for example, mkfs.ntfs running time can be decreased to ~1/10.

But if one rq starts with non-aligned buffer(the 1st bvec's
bv_offset isn't zero) and if we allow to merge, it is quite
difficult to respect sg gap limit, especially the segment
can't be at maximum segment size, otherwise the segment
ends in unaligned virt boundary. This patch trys to avoid the
issue by not allowing to merge if the req starts with non-aligned
buffer.

Also add comments to explain why the merged segment can't
end in unaligned virt boundary.

Fixes: 729204ef49ec ("block: relax check on sg gap")
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/blkdev.h | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

Comments

Bart Van Assche April 13, 2017, 4:22 p.m. UTC | #1
On Fri, 2017-04-14 at 00:06 +0800, Ming Lei wrote:
> But if one rq starts with non-aligned buffer(the 1st bvec's
> bv_offset isn't zero) and if we allow to merge, it is quite
> difficult to respect sg gap limit, especially the segment
> can't be at maximum segment size, otherwise the segment
> ends in unaligned virt boundary. This patch trys to avoid the
> issue by not allowing to merge if the req starts with non-aligned
> buffer.

Hello Ming,

Why is it considered difficult to detect whether or not a gap exists if
the offset of the first bio is non-zero? Please note that a thoroughly
tested version of gap detection code that supports non-zero offsets for
the first element is already upstream. See also ib_map_mr_sg() and
ib_sg_to_pages() in drivers/infiniband/core/verbs.c.

Thanks,

Bart.
Ming Lei April 14, 2017, 2:04 a.m. UTC | #2
On Thu, Apr 13, 2017 at 04:22:22PM +0000, Bart Van Assche wrote:
> On Fri, 2017-04-14 at 00:06 +0800, Ming Lei wrote:
> > But if one rq starts with non-aligned buffer(the 1st bvec's
> > bv_offset isn't zero) and if we allow to merge, it is quite
> > difficult to respect sg gap limit, especially the segment
> > can't be at maximum segment size, otherwise the segment
> > ends in unaligned virt boundary. This patch trys to avoid the
> > issue by not allowing to merge if the req starts with non-aligned
> > buffer.
> 
> Hello Ming,
> 
> Why is it considered difficult to detect whether or not a gap exists if
> the offset of the first bio is non-zero? Please note that a thoroughly
> tested version of gap detection code that supports non-zero offsets for
> the first element is already upstream. See also ib_map_mr_sg() and
> ib_sg_to_pages() in drivers/infiniband/core/verbs.c.

I don't know infiniband much, but from the code, it is just about
sg, and looks not same with block's sg gap limit.

The sg gap limit in block layer actually means:
	- except for last segment, every segment has to end in aligned virt boundary
	- from the 2nd segment, offset of the segment has to be zero

But we don't store main segment info(start, size) in bio/req,
all the segments are basically computed in the flight, so it isn't
easy to check/handle sg gap. Also IMO segment handling of block layer is
quite fragile, and hope multi-page bvec can improve the case much.

We relax check for sg gap for allowing merging tons of small bios, if
these bios are physically contiguous. But if offset of the 1st bio
in the merged req isn't zero, current segment compution in
__blk_segment_map_sg() doesn't consider sg gap, so one segment may
ends in unaligned virt boundary.

Given it is late in v4.11 cycle, this patch doesn't want to touch
__blk_segment_map_sg() for avoiding regression risk, instead just
changing bio_will_gap() for avoding the case with minimized risk,
because the change is quite simple.


Thanks,
Ming
Johannes Thumshirn April 14, 2017, 11:26 a.m. UTC | #3
On Thu, Apr 13, 2017 at 6:06 PM, Ming Lei <ming.lei@redhat.com> wrote:
> Now commit 729204ef49ec("block: relax check on sg gap")
> allows us to merge bios if both are physically contiguous,
> this change can merge huge of small bios in use case of mkfs,
> for example, mkfs.ntfs running time can be decreased to ~1/10.
>
> But if one rq starts with non-aligned buffer(the 1st bvec's
> bv_offset isn't zero) and if we allow to merge, it is quite
> difficult to respect sg gap limit, especially the segment
> can't be at maximum segment size, otherwise the segment
> ends in unaligned virt boundary. This patch trys to avoid the
> issue by not allowing to merge if the req starts with non-aligned
> buffer.
>
> Also add comments to explain why the merged segment can't
> end in unaligned virt boundary.
>
> Fixes: 729204ef49ec ("block: relax check on sg gap")
> Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---


Hi Jens and Omar,

I know Jens is currently on vacation, but can maybe Omar get this fix to Linus
before v4.11? This would be highly appreciated.

Thanks a lot,
        Johannes
Jens Axboe April 14, 2017, 7:51 p.m. UTC | #4
On 04/14/2017 05:26 AM, Johannes Thumshirn wrote:
> On Thu, Apr 13, 2017 at 6:06 PM, Ming Lei <ming.lei@redhat.com> wrote:
>> Now commit 729204ef49ec("block: relax check on sg gap")
>> allows us to merge bios if both are physically contiguous,
>> this change can merge huge of small bios in use case of mkfs,
>> for example, mkfs.ntfs running time can be decreased to ~1/10.
>>
>> But if one rq starts with non-aligned buffer(the 1st bvec's
>> bv_offset isn't zero) and if we allow to merge, it is quite
>> difficult to respect sg gap limit, especially the segment
>> can't be at maximum segment size, otherwise the segment
>> ends in unaligned virt boundary. This patch trys to avoid the
>> issue by not allowing to merge if the req starts with non-aligned
>> buffer.
>>
>> Also add comments to explain why the merged segment can't
>> end in unaligned virt boundary.
>>
>> Fixes: 729204ef49ec ("block: relax check on sg gap")
>> Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
>> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>> ---
> 
> 
> Hi Jens and Omar,
> 
> I know Jens is currently on vacation, but can maybe Omar get this fix to Linus
> before v4.11? This would be highly appreciated.

I'll pick this up now. I'm back early next week, so we're still fine for
4.11. But I'm glad I pushed back on the original change for 4.10 now.
diff mbox

Patch

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b75d6fe5a1b9..5eaf323de98b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1672,12 +1672,36 @@  static inline bool bios_segs_mergeable(struct request_queue *q,
 	return true;
 }
 
-static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
-			 struct bio *next)
+static inline bool bio_will_gap(struct request_queue *q,
+				struct request *prev_rq,
+				struct bio *prev,
+				struct bio *next)
 {
 	if (bio_has_data(prev) && queue_virt_boundary(q)) {
 		struct bio_vec pb, nb;
 
+		/*
+		 * don't merge if the 1st bio starts with non-zero
+		 * offset, otherwise it is quite difficult to respect
+		 * sg gap limit. We work hard to merge huge of small
+		 * single bios in case of mkfs.
+		 */
+		if (prev_rq)
+			bio_get_first_bvec(prev_rq->bio, &pb);
+		else
+			bio_get_first_bvec(prev, &pb);
+		if (pb.bv_offset)
+			return true;
+
+		/*
+		 * We don't need to worry about the situation that the
+		 * merged segment ends in unaligned virt boundary:
+		 *
+		 * - if 'pb' ends aligned, the merged segment ends aligned
+		 * - if 'pb' ends unaligned, the next bio must include
+		 *   one single bvec of 'nb', otherwise the 'nb' can't
+		 *   merge with 'pb'
+		 */
 		bio_get_last_bvec(prev, &pb);
 		bio_get_first_bvec(next, &nb);
 
@@ -1690,12 +1714,12 @@  static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 
 static inline bool req_gap_back_merge(struct request *req, struct bio *bio)
 {
-	return bio_will_gap(req->q, req->biotail, bio);
+	return bio_will_gap(req->q, req, req->biotail, bio);
 }
 
 static inline bool req_gap_front_merge(struct request *req, struct bio *bio)
 {
-	return bio_will_gap(req->q, bio, req->bio);
+	return bio_will_gap(req->q, NULL, bio, req->bio);
 }
 
 static inline void blk_dump_rq(const struct request *req, const char *msg)