[RFC] block: fix bio merge checks when virt_boundary is set

Message ID 20160316223804.GA6217@localhost.lm.intel.com (mailing list archive)
State New, archived

Commit Message

Keith Busch March 16, 2016, 10:38 p.m. UTC
On Wed, Mar 16, 2016 at 05:26:28PM +0100, Vitaly Kuznetsov wrote:
> Ming Lei <tom.leiming@gmail.com> writes:
> > We do have the above merge in bio_add_page(), so the two bios in
> > your above example shouldn't have been observed if the two buffers
> > are added to bio via the bio_add_page().
> >
> > If you see short bios in above example, maybe you need to check ntfs code:
> >
> > - whether bio_add_page() is used to add the buffer
> > - whether one standalone bio is used to transfer each 512 bytes, even
> > when they are in the same page and the sectors are contiguous
> 
> I'm not using ntfs, mkfs.ntfs is a userspace application which shows the
> regression when virt_boundary is in place. I should have avoided
> mentioning bio_add_pc_page() here as it is unrelated to the issue.
> 
> In particular, I'm concerned about the following call sites:
> blk_bio_segment_split()
> ll_back_merge_fn()
> ll_front_merge_fn()

I don't think blk_bio_segment_split would have seen such a bio vector
if its pages were added with bio_add_page. Those should already have
been combined. In any case, I think you can get what you're after just
by moving the gap check after BIOVEC_PHYS_MERGEABLE. Does the following
look ok to you?

---
--
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Vitaly Kuznetsov March 17, 2016, 11:20 a.m. UTC | #1
Keith Busch <keith.busch@intel.com> writes:

> On Wed, Mar 16, 2016 at 05:26:28PM +0100, Vitaly Kuznetsov wrote:
>> Ming Lei <tom.leiming@gmail.com> writes:
>> > We do have the above merge in bio_add_page(), so the two bios in
>> > your above example shouldn't have been observed if the two buffers
>> > are added to bio via the bio_add_page().
>> >
>> > If you see short bios in above example, maybe you need to check ntfs code:
>> >
>> > - whether bio_add_page() is used to add the buffer
>> > - whether one standalone bio is used to transfer each 512 bytes, even
>> > when they are in the same page and the sectors are contiguous
>> 
>> I'm not using ntfs, mkfs.ntfs is a userspace application which shows the
>> regression when virt_boundary is in place. I should have avoided
>> mentioning bio_add_pc_page() here as it is unrelated to the issue.
>> 
>> In particular, I'm concerned about the following call sites:
>> blk_bio_segment_split()
>> ll_back_merge_fn()
>> ll_front_merge_fn()
>
> I don't think blk_bio_segment_split would have seen such a bio vector
> if its pages were added with bio_add_page. Those should already have
> been combined. In any case, I think you can get what you're after just
> by moving the gap check after BIOVEC_PHYS_MERGEABLE. Does the following
> look ok to you?
>

Thanks, it does.

Just tested against 4.5, the test was:

# time mkfs.ntfs -s 512 -Q /dev/sdc1

The results are:

non-patched kernel:
real 0m35.552s
user 0m0.006s
sys 0m28.316s

my patch:
real 0m6.277s
user 0m0.010s
sys 0m5.870s

your patch:
real 0m4.247s
user 0m0.005s
sys 0m4.136s
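
To put the three "real" times side by side, the speedups can be computed directly. This is only a small arithmetic sketch over the figures reported above; the constant and function names are made up for illustration.

```c
/* Wall-clock ("real") times from the mkfs.ntfs runs above, in seconds. */
static const double t_unpatched = 35.552;
static const double t_vitaly    = 6.277;   /* "my patch" */
static const double t_keith     = 4.247;   /* "your patch" */

/* How many times faster the patched run is than the baseline run. */
static double speedup(double baseline, double patched)
{
	return baseline / patched;
}
```

speedup(t_unpatched, t_keith) comes out at roughly 8.4x and speedup(t_unpatched, t_vitaly) at roughly 5.7x, from a single run each.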

Will you send it or would you like me to do that with your Suggested-by?

(a nitpick below)

> ---
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 2613531..4aa8e44 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -96,13 +96,6 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  	const unsigned max_sectors = get_max_io_size(q, bio);
>
>  	bio_for_each_segment(bv, bio, iter) {
> -		/*
> -		 * If the queue doesn't support SG gaps and adding this
> -		 * offset would create a gap, disallow it.
> -		 */
> -		if (bvprvp && bvec_gap_to_prev(q, bvprvp, bv.bv_offset))
> -			goto split;
> -
>  		if (sectors + (bv.bv_len >> 9) > max_sectors) {
>  			/*
>  			 * Consider this a new segment if we're splitting in
> @@ -139,6 +132,13 @@ new_segment:
>  		if (nsegs == queue_max_segments(q))
>  			goto split;
>
> +		/*
> +		 * If the queue doesn't support SG gaps and adding this
> +		 * offset would create a gap, disallow it.
> +		 */
> +		if (bvprvp && bvec_gap_to_prev(q, bvprvp, bv.bv_offset))
> +			goto split;
> +
>  		nsegs++;
>  		bvprv = bv;
>  		bvprvp = &bvprv;
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 413c84f..69cffbe 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1400,7 +1400,8 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
>  		bio_get_last_bvec(prev, &pb);
>  		bio_get_first_bvec(next, &nb);
>
> -		return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
> +		if (!BIOVEC_PHYS_MERGEABLE(&pb, &nb))
> +			return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
>  	}

Any reason to put this check here and not move it into
__bvec_gap_to_prev()? I find it misleading that __bvec_gap_to_prev()
reports a gap when offset != 0 without checking BIOVEC_PHYS_MERGEABLE().

>
>  	return false;
> --
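
The interaction between the gap check and physical mergeability can be modelled in userspace. The following is a minimal sketch, not kernel code: the model_* names and the struct are hypothetical, and it assumes the gap test reports a gap when the next offset is non-zero or the previous vector does not end on the virt boundary.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for struct bio_vec: physical address plus length. */
struct model_bvec {
	uint64_t phys;
	uint32_t len;
};

/* Offset of the vector within its page. */
static uint32_t model_offset(const struct model_bvec *bv)
{
	return (uint32_t)(bv->phys & (MODEL_PAGE_SIZE - 1));
}

/* Same idea as BIOVEC_PHYS_MERGEABLE: next starts exactly where prev ends. */
static bool model_phys_mergeable(const struct model_bvec *prev,
				 const struct model_bvec *next)
{
	return prev->phys + prev->len == next->phys;
}

/*
 * Assumed shape of the gap test: with a virt boundary mask set, a gap is
 * reported if the next offset is non-zero or prev ends off the boundary.
 * It does not look at physical contiguity at all.
 */
static bool model_gap_to_prev(uint32_t virt_boundary_mask,
			      const struct model_bvec *prev,
			      uint32_t next_offset)
{
	if (!virt_boundary_mask)
		return false;
	return next_offset != 0 ||
	       ((model_offset(prev) + prev->len) & virt_boundary_mask) != 0;
}

/* The RFC's ordering: contiguous vectors are never treated as a gap. */
static bool model_will_gap_patched(uint32_t virt_boundary_mask,
				   const struct model_bvec *prev,
				   const struct model_bvec *next)
{
	if (model_phys_mergeable(prev, next))
		return false;
	return model_gap_to_prev(virt_boundary_mask, prev, model_offset(next));
}
```

With a 4 KiB boundary (mask 0xfff), two 512-byte buffers at 0x10000 and 0x10200 are reported as a gap by model_gap_to_prev() alone but not by model_will_gap_patched(), which is the case the nitpick above is about.
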
Keith Busch March 17, 2016, 4:39 p.m. UTC | #2
On Thu, Mar 17, 2016 at 12:20:28PM +0100, Vitaly Kuznetsov wrote:
> Keith Busch <keith.busch@intel.com> writes:
> > been combined. In any case, I think you can get what you're after just
> > by moving the gap check after BIOVEC_PHYS_MERGEABLE. Does the following
> > look ok to you?
> >
> 
> Thanks, it does.

Cool, thanks for confirming.

> Will you send it or would you like me to do that with your Suggested-by?

I'm not confident yet this doesn't break anything, particularly since
we moved the gap check after the length check. Just wanted to confirm
the concept addressed your concern, but still need to take a closer look
and test before submitting.
Ming Lei March 18, 2016, 2:59 a.m. UTC | #3
On Fri, Mar 18, 2016 at 12:39 AM, Keith Busch <keith.busch@intel.com> wrote:
> On Thu, Mar 17, 2016 at 12:20:28PM +0100, Vitaly Kuznetsov wrote:
>> Keith Busch <keith.busch@intel.com> writes:
>> > been combined. In any case, I think you can get what you're after just
>> > by moving the gap check after BIOVEC_PHYS_MERGEABLE. Does the following
>> > look ok to you?
>> >
>>
>> Thanks, it does.
>
> Cool, thanks for confirming.
>
>> Will you send it or would you like me to do that with your Suggested-by?
>
> I'm not confident yet this doesn't break anything, particularly since
> we moved the gap check after the length check. Just wanted to confirm
> the concept addressed your concern, but still need to take a closer look
> and test before submitting.

IMO, the change to blk_bio_segment_split() is correct: the constraint is
about SG gaps, so the check should be done between segments rather than
between bvecs. It is therefore reasonable to move the check to just
before a new segment is populated.
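
The effect of checking for gaps between segments rather than between bvecs can be sketched with a simplified userspace model of the split loop. Everything here is an assumption made for illustration: the model_* names are hypothetical, the max_sectors and segment-boundary checks are left out, and the gap test is assumed to fire on a non-zero next offset or a previous vector that ends off the virt boundary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

struct model_bvec {
	uint64_t phys;
	uint32_t len;
};

struct model_limits {
	uint32_t virt_boundary_mask;
	uint32_t max_segment_size;
	unsigned max_segments;
};

/* Assumed gap test between two vectors (see lead-in). */
static bool model_gap(const struct model_limits *lim,
		      const struct model_bvec *prev,
		      const struct model_bvec *next)
{
	uint32_t next_off = (uint32_t)(next->phys & (MODEL_PAGE_SIZE - 1));
	uint32_t prev_off = (uint32_t)(prev->phys & (MODEL_PAGE_SIZE - 1));

	if (!lim->virt_boundary_mask)
		return false;
	return next_off != 0 ||
	       ((prev_off + prev->len) & lim->virt_boundary_mask) != 0;
}

/*
 * Returns the index of the first vector that forces a split, or n if the
 * whole array fits. gap_check_first selects the old ordering (gap test on
 * every vector); otherwise the gap test runs only when a new segment starts.
 */
static size_t model_split_index(const struct model_limits *lim,
				const struct model_bvec *bv, size_t n,
				bool gap_check_first)
{
	const struct model_bvec *prev = NULL;
	uint32_t seg_size = 0;
	unsigned nsegs = 0;

	for (size_t i = 0; i < n; i++) {
		if (gap_check_first && prev && model_gap(lim, prev, &bv[i]))
			return i;

		/* Contiguous and within the size limit: grow the current segment. */
		if (prev && seg_size + bv[i].len <= lim->max_segment_size &&
		    prev->phys + prev->len == bv[i].phys) {
			seg_size += bv[i].len;
			prev = &bv[i];
			continue;
		}

		/* A new segment would start here. */
		if (nsegs == lim->max_segments)
			return i;
		if (!gap_check_first && prev && model_gap(lim, prev, &bv[i]))
			return i;

		nsegs++;
		seg_size = bv[i].len;
		prev = &bv[i];
	}
	return n;
}
```

In this model, three contiguous 512-byte vectors starting at 0x10000 are split at index 1 with the old ordering and not split at all with the new one, while a non-contiguous vector at a non-zero offset still forces a split in both orderings.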

But for the 2nd change in bio_will_gap(), which should fix Vitaly's
problem, I am still not sure it is completely correct. bio_will_gap() is
used to check whether two bios may be merged. Suppose two bios are
physically contiguous: the last bvec in the 1st bio and the first bvec in
the 2nd bio still might not end up in the same segment because of the
segment size limit.
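
One reading of this concern can be written down as a small predicate. This is a hypothetical userspace sketch, not kernel code: the model_* names are invented, and the virt boundary test is simplified to "the address where the new segment would start is not aligned to the mask".

```c
#include <stdbool.h>
#include <stdint.h>

struct model_bvec {
	uint64_t phys;
	uint32_t len;
};

/* next starts exactly where prev ends. */
static bool model_contiguous(const struct model_bvec *prev,
			     const struct model_bvec *next)
{
	return prev->phys + prev->len == next->phys;
}

/* Can next be appended to a segment that already holds seg_size bytes? */
static bool model_fits_segment(uint32_t seg_size,
			       const struct model_bvec *next,
			       uint32_t max_segment_size)
{
	return seg_size + next->len <= max_segment_size;
}

/* Simplified alignment test for the start of a would-be new segment. */
static bool model_start_unaligned(uint32_t virt_boundary_mask,
				  const struct model_bvec *next)
{
	return virt_boundary_mask != 0 &&
	       (next->phys & virt_boundary_mask) != 0;
}

/*
 * True when the two vectors are contiguous, yet next must open a new
 * segment (size limit) that starts off the virt boundary. A merge check
 * based on contiguity alone would not notice this case.
 */
static bool model_contiguity_hides_gap(uint32_t virt_boundary_mask,
				       uint32_t max_segment_size,
				       uint32_t seg_size,
				       const struct model_bvec *prev,
				       const struct model_bvec *next)
{
	return model_contiguous(prev, next) &&
	       !model_fits_segment(seg_size, next, max_segment_size) &&
	       model_start_unaligned(virt_boundary_mask, next);
}
```

For example, with a 4 KiB mask and a 4096-byte segment size limit, a first bio whose last segment is already full and ends at 0x11200, followed by a bio starting at 0x11200, satisfies the predicate; with a 64 KiB limit it does not.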

The root cause might be in blkdev_writepage(); I guess these small
bios come from there.

thanks,
Ming Lei

Patch

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2613531..4aa8e44 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -96,13 +96,6 @@  static struct bio *blk_bio_segment_split(struct request_queue *q,
 	const unsigned max_sectors = get_max_io_size(q, bio);
 
 	bio_for_each_segment(bv, bio, iter) {
-		/*
-		 * If the queue doesn't support SG gaps and adding this
-		 * offset would create a gap, disallow it.
-		 */
-		if (bvprvp && bvec_gap_to_prev(q, bvprvp, bv.bv_offset))
-			goto split;
-
 		if (sectors + (bv.bv_len >> 9) > max_sectors) {
 			/*
 			 * Consider this a new segment if we're splitting in
@@ -139,6 +132,13 @@  new_segment:
 		if (nsegs == queue_max_segments(q))
 			goto split;
 
+		/*
+		 * If the queue doesn't support SG gaps and adding this
+		 * offset would create a gap, disallow it.
+		 */
+		if (bvprvp && bvec_gap_to_prev(q, bvprvp, bv.bv_offset))
+			goto split;
+
 		nsegs++;
 		bvprv = bv;
 		bvprvp = &bvprv;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 413c84f..69cffbe 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1400,7 +1400,8 @@  static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 		bio_get_last_bvec(prev, &pb);
 		bio_get_first_bvec(next, &nb);
 
-		return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
+		if (!BIOVEC_PHYS_MERGEABLE(&pb, &nb))
+			return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
 	}
 
 	return false;