[V10,14/19] block: enable multipage bvecs

Message ID 20181115085306.9910-15-ming.lei@redhat.com (mailing list archive)
State New, archived
Series block: support multi-page bvec

Commit Message

Ming Lei Nov. 15, 2018, 8:53 a.m. UTC
This patch pulls the trigger for multi-page bvecs.

Now any request queue which supports queue cluster will see multi-page
bvecs.

Cc: Dave Chinner <dchinner@redhat.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org
Cc: linux-erofs@lists.ozlabs.org
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: Gao Xiang <gaoxiang25@huawei.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Cc: Coly Li <colyli@suse.de>
Cc: linux-bcache@vger.kernel.org
Cc: Boaz Harrosh <ooo@electrozaur.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: cluster-devel@redhat.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

Comments

Omar Sandoval Nov. 16, 2018, 1:56 a.m. UTC | #1
On Thu, Nov 15, 2018 at 04:53:01PM +0800, Ming Lei wrote:
> This patch pulls the trigger for multi-page bvecs.
> 
> Now any request queue which supports queue cluster will see multi-page
> bvecs.
> 
> Cc: Dave Chinner <dchinner@redhat.com>
> Cc: Kent Overstreet <kent.overstreet@gmail.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: dm-devel@redhat.com
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: Shaohua Li <shli@kernel.org>
> Cc: linux-raid@vger.kernel.org
> Cc: linux-erofs@lists.ozlabs.org
> Cc: David Sterba <dsterba@suse.com>
> Cc: linux-btrfs@vger.kernel.org
> Cc: Darrick J. Wong <darrick.wong@oracle.com>
> Cc: linux-xfs@vger.kernel.org
> Cc: Gao Xiang <gaoxiang25@huawei.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Theodore Ts'o <tytso@mit.edu>
> Cc: linux-ext4@vger.kernel.org
> Cc: Coly Li <colyli@suse.de>
> Cc: linux-bcache@vger.kernel.org
> Cc: Boaz Harrosh <ooo@electrozaur.com>
> Cc: Bob Peterson <rpeterso@redhat.com>
> Cc: cluster-devel@redhat.com
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/bio.c | 24 ++++++++++++++++++------
>  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 6486722d4d4b..ed6df6f8e63d 100644
> --- a/block/bio.c
> +++ b/block/bio.c

This comment above __bio_try_merge_page() doesn't make sense after this
change:

 This is a
 a useful optimisation for file systems with a block size smaller than the
 page size.

Can you please get rid of it in this patch?

> @@ -767,12 +767,24 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
>  
>  	if (bio->bi_vcnt > 0) {
>  		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
> -
> -		if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) {
> -			bv->bv_len += len;
> -			bio->bi_iter.bi_size += len;
> -			return true;
> -		}
> +		struct request_queue *q = NULL;
> +
> +		if (page == bv->bv_page && off == (bv->bv_offset + bv->bv_len)
> +				&& (off + len) <= PAGE_SIZE)
> +			goto merge;

The parentheses around (bv->bv_offset + bv->bv_len) and (off + len) are
unnecessary noise.

What's the point of the new (off + len) <= PAGE_SIZE check?

> +
> +		if (bio->bi_disk)
> +			q = bio->bi_disk->queue;
> +
> +		/* disable multi-page bvec too if cluster isn't enabled */
> +		if (!q || !blk_queue_cluster(q) ||
> +		    ((page_to_phys(bv->bv_page) + bv->bv_offset + bv->bv_len) !=
> +		     (page_to_phys(page) + off)))

More unnecessary parentheses here.

> +			return false;
> + merge:
> +		bv->bv_len += len;
> +		bio->bi_iter.bi_size += len;
> +		return true;
>  	}
>  	return false;
>  }
> -- 
> 2.9.5
>
Christoph Hellwig Nov. 16, 2018, 1:53 p.m. UTC | #2
> -
> -		if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) {
> -			bv->bv_len += len;
> -			bio->bi_iter.bi_size += len;
> -			return true;
> -		}
> +		struct request_queue *q = NULL;
> +
> +		if (page == bv->bv_page && off == (bv->bv_offset + bv->bv_len)
> +				&& (off + len) <= PAGE_SIZE)

How could the page struct be the same, but the range beyond PAGE_SIZE
(at least with the existing callers)?

Also no need for the inner braces, and the && always goes on the
first line.

> +		if (bio->bi_disk)
> +			q = bio->bi_disk->queue;
> +
> +		/* disable multi-page bvec too if cluster isn't enabled */
> +		if (!q || !blk_queue_cluster(q) ||
> +		    ((page_to_phys(bv->bv_page) + bv->bv_offset + bv->bv_len) !=
> +		     (page_to_phys(page) + off)))
> +			return false;
> + merge:
> +		bv->bv_len += len;
> +		bio->bi_iter.bi_size += len;
> +		return true;

Ok, this is scary, as it will give different results depending on when
bi_disk is assigned.  But then again we shouldn't really do the cluster
check here, but rather when splitting the bio for the actual low-level
driver.

(and eventually we should kill this clustering setting off in favor
of our normal segment limits).
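
The order dependence raised above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `model_page`, `model_bvec`, `model_bio`, `model_try_merge()`, `model_add_page()` and `model_build()` are hypothetical names, `has_queue` stands in for `bio->bi_disk` being set, and `cluster` stands in for `blk_queue_cluster(q)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-ins for struct page, struct bio_vec and struct bio. */
struct model_page { uint64_t phys; };	/* physical address of the page */
struct model_bvec { const struct model_page *page; unsigned off, len; };
struct model_bio {
	struct model_bvec vec[8];
	unsigned vcnt;
	bool has_queue;		/* models bio->bi_disk != NULL */
	bool cluster;		/* models blk_queue_cluster(q) */
};

/*
 * Mirrors the decision in the posted hunk: a same-page append always
 * merges; a cross-page append merges only when a clustering queue is
 * already known and the data is physically contiguous.
 */
static bool model_try_merge(struct model_bio *bio, const struct model_page *page,
			    unsigned len, unsigned off)
{
	struct model_bvec *bv;

	if (bio->vcnt == 0)
		return false;
	bv = &bio->vec[bio->vcnt - 1];

	if (page == bv->page && off == bv->off + bv->len)
		goto merge;
	if (!bio->has_queue || !bio->cluster ||
	    bv->page->phys + bv->off + bv->len != page->phys + off)
		return false;
merge:
	bv->len += len;
	return true;
}

/* Add a page, starting a new bvec when merging is refused. */
static void model_add_page(struct model_bio *bio, const struct model_page *page,
			   unsigned len, unsigned off)
{
	assert(len > 0 && off + len <= MODEL_PAGE_SIZE);
	if (model_try_merge(bio, page, len, off))
		return;
	assert(bio->vcnt < 8);
	bio->vec[bio->vcnt++] = (struct model_bvec){ page, off, len };
}

/* Build a bio from two physically adjacent pages; return the bvec count. */
static unsigned model_build(bool assign_queue_first)
{
	static const struct model_page pages[2] = { { 0x10000 }, { 0x11000 } };
	struct model_bio bio = { .vcnt = 0, .has_queue = assign_queue_first,
				 .cluster = true };

	model_add_page(&bio, &pages[0], MODEL_PAGE_SIZE, 0);
	model_add_page(&bio, &pages[1], MODEL_PAGE_SIZE, 0);
	bio.has_queue = true;	/* the disk is assigned by now either way */
	return bio.vcnt;
}
```

In this model, `model_build(true)` ends with one two-page bvec while `model_build(false)` ends with two single-page bvecs, even though the pages and the final queue are identical.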
Ming Lei Nov. 19, 2018, 8:45 a.m. UTC | #3
On Thu, Nov 15, 2018 at 05:56:27PM -0800, Omar Sandoval wrote:
> On Thu, Nov 15, 2018 at 04:53:01PM +0800, Ming Lei wrote:
> > This patch pulls the trigger for multi-page bvecs.
> > 
> > Now any request queue which supports queue cluster will see multi-page
> > bvecs.
> > 
> > Cc: Dave Chinner <dchinner@redhat.com>
> > Cc: Kent Overstreet <kent.overstreet@gmail.com>
> > Cc: Mike Snitzer <snitzer@redhat.com>
> > Cc: dm-devel@redhat.com
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: Shaohua Li <shli@kernel.org>
> > Cc: linux-raid@vger.kernel.org
> > Cc: linux-erofs@lists.ozlabs.org
> > Cc: David Sterba <dsterba@suse.com>
> > Cc: linux-btrfs@vger.kernel.org
> > Cc: Darrick J. Wong <darrick.wong@oracle.com>
> > Cc: linux-xfs@vger.kernel.org
> > Cc: Gao Xiang <gaoxiang25@huawei.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Theodore Ts'o <tytso@mit.edu>
> > Cc: linux-ext4@vger.kernel.org
> > Cc: Coly Li <colyli@suse.de>
> > Cc: linux-bcache@vger.kernel.org
> > Cc: Boaz Harrosh <ooo@electrozaur.com>
> > Cc: Bob Peterson <rpeterso@redhat.com>
> > Cc: cluster-devel@redhat.com
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  block/bio.c | 24 ++++++++++++++++++------
> >  1 file changed, 18 insertions(+), 6 deletions(-)
> > 
> > diff --git a/block/bio.c b/block/bio.c
> > index 6486722d4d4b..ed6df6f8e63d 100644
> > --- a/block/bio.c
> > +++ b/block/bio.c
> 
> This comment above __bio_try_merge_page() doesn't make sense after this
> change:
> 
>  This is a
>  a useful optimisation for file systems with a block size smaller than the
>  page size.
> 
> Can you please get rid of it in this patch?

As I understand it, __bio_try_merge_page() still handles the original
cases, so the optimization for sub-page-size blocks is still there too,
isn't it?

> 
> > @@ -767,12 +767,24 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
> >  
> >  	if (bio->bi_vcnt > 0) {
> >  		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
> > -
> > -		if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) {
> > -			bv->bv_len += len;
> > -			bio->bi_iter.bi_size += len;
> > -			return true;
> > -		}
> > +		struct request_queue *q = NULL;
> > +
> > +		if (page == bv->bv_page && off == (bv->bv_offset + bv->bv_len)
> > +				&& (off + len) <= PAGE_SIZE)
> > +			goto merge;
> 
> The parentheses around (bv->bv_offset + bv->bv_len) and (off + len) are
> unnecessary noise.
> 
> What's the point of the new (off + len) <= PAGE_SIZE check?

Yeah, I don't know why I added it :-( The check is always true.

> 
> > +
> > +		if (bio->bi_disk)
> > +			q = bio->bi_disk->queue;
> > +
> > +		/* disable multi-page bvec too if cluster isn't enabled */
> > +		if (!q || !blk_queue_cluster(q) ||
> > +		    ((page_to_phys(bv->bv_page) + bv->bv_offset + bv->bv_len) !=
> > +		     (page_to_phys(page) + off)))
> 
> More unnecessary parentheses here.

OK.

Thanks,
Ming
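
With both review points from this exchange applied (the always-true bound check removed and the extra parentheses dropped), the merge decision reduces to something like the following. This is a user-space sketch, not the next version of the patch: `model_tail`, `model_phys_contiguous()` and `model_can_merge()` are hypothetical names, `page_phys` stands in for the result of `page_to_phys()`, and page identity is modelled as equal physical addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flat view of the values the posted condition reads. */
struct model_tail {
	uint64_t page_phys;	/* stands in for page_to_phys(bv->bv_page) */
	unsigned off, len;	/* stand in for bv->bv_offset, bv->bv_len */
};

/* True when data at (page_phys + off) starts exactly where the tail ends. */
static bool model_phys_contiguous(struct model_tail bv, uint64_t page_phys,
				  unsigned off)
{
	return bv.page_phys + bv.off + bv.len == page_phys + off;
}

/* Merge decision without the redundant PAGE_SIZE test. */
static bool model_can_merge(struct model_tail bv, uint64_t page_phys,
			    unsigned off, bool have_queue, bool cluster)
{
	assert(bv.len > 0);	/* a tail bvec always holds some data */

	if (bv.page_phys == page_phys && off == bv.off + bv.len)
		return true;	/* same-page append */
	return have_queue && cluster &&
	       model_phys_contiguous(bv, page_phys, off);
}
```

The same-page case never consults the queue, which matches the posted hunk; only the cross-page case depends on the clustering setting.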
Ming Lei Nov. 19, 2018, 9 a.m. UTC | #4
On Fri, Nov 16, 2018 at 02:53:08PM +0100, Christoph Hellwig wrote:
> > -
> > -		if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) {
> > -			bv->bv_len += len;
> > -			bio->bi_iter.bi_size += len;
> > -			return true;
> > -		}
> > +		struct request_queue *q = NULL;
> > +
> > +		if (page == bv->bv_page && off == (bv->bv_offset + bv->bv_len)
> > +				&& (off + len) <= PAGE_SIZE)
> 
> How could the page struct be the same, but the range beyond PAGE_SIZE
> (at least with the existing callers)?
> 
> Also no need for the inner braces, and the && always goes on the
> first line.

OK.

> 
> > +		if (bio->bi_disk)
> > +			q = bio->bi_disk->queue;
> > +
> > +		/* disable multi-page bvec too if cluster isn't enabled */
> > +		if (!q || !blk_queue_cluster(q) ||
> > +		    ((page_to_phys(bv->bv_page) + bv->bv_offset + bv->bv_len) !=
> > +		     (page_to_phys(page) + off)))
> > +			return false;
> > + merge:
> > +		bv->bv_len += len;
> > +		bio->bi_iter.bi_size += len;
> > +		return true;
> 
> Ok, this is scary, as it will give different results depending on when
> bi_disk is assigned.

The only difference is whether the page gets merged or not, and both
outcomes are handled correctly now.

> But then again we shouldn't really do the cluster
> check here, but rather when splitting the bio for the actual low-level
> driver.

Yeah, I thought of doing it this way too, but it may cause a large number
of bio splits for non-clustering queues, and there are quite a few SCSI
devices which require clustering to be disabled.

[linux]$ git grep -n DISABLE_CLUSTERING ./drivers/scsi/ | wc -l
     28

Alternatively, we could introduce bio_split_to_single_page_bvec() to
allocate a single-page bvec table and convert to it for non-clustering
queues. I will try this approach in the next version.
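
The conversion proposed here can be pictured with a small user-space sketch. It is not an implementation of bio_split_to_single_page_bvec(), which exists only as a proposal in this thread: `model_seg` and `model_split_single_page()` are hypothetical names, pages are identified by a frame number, the pages of a multi-page segment are assumed to be physically consecutive, and the output goes into a caller-supplied array instead of a newly allocated bvec table.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical segment: page frame number plus byte range; len may span pages. */
struct model_seg { unsigned long pfn; unsigned off, len; };

/*
 * Split one multi-page segment into single-page segments.
 * Writes at most max_out entries and returns how many are needed.
 */
static size_t model_split_single_page(struct model_seg in,
				      struct model_seg *out, size_t max_out)
{
	size_t n = 0;

	assert(in.off < MODEL_PAGE_SIZE);
	while (in.len > 0) {
		unsigned room = MODEL_PAGE_SIZE - in.off;
		unsigned chunk = in.len < room ? in.len : room;

		if (n < max_out)
			out[n] = (struct model_seg){ in.pfn, in.off, chunk };
		n++;
		in.pfn++;	/* assumes physically consecutive pages */
		in.off = 0;
		in.len -= chunk;
	}
	return n;
}
```

For example, a segment of 8192 bytes starting at offset 512 of frame 100 splits into three pieces: 3584 bytes in frame 100, a full 4096 bytes in frame 101, and 512 bytes in frame 102. Returning the required count even when the output array is too small lets a caller size the table before converting.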

> 
> (and eventually we should kill this clustering setting off in favor
> of our normal segment limits).

Yeah, it is already on my post-multi-page todo list :-)

thanks,
Ming
Patch

diff --git a/block/bio.c b/block/bio.c
index 6486722d4d4b..ed6df6f8e63d 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -767,12 +767,24 @@  bool __bio_try_merge_page(struct bio *bio, struct page *page,
 
 	if (bio->bi_vcnt > 0) {
 		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
-
-		if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) {
-			bv->bv_len += len;
-			bio->bi_iter.bi_size += len;
-			return true;
-		}
+		struct request_queue *q = NULL;
+
+		if (page == bv->bv_page && off == (bv->bv_offset + bv->bv_len)
+				&& (off + len) <= PAGE_SIZE)
+			goto merge;
+
+		if (bio->bi_disk)
+			q = bio->bi_disk->queue;
+
+		/* disable multi-page bvec too if cluster isn't enabled */
+		if (!q || !blk_queue_cluster(q) ||
+		    ((page_to_phys(bv->bv_page) + bv->bv_offset + bv->bv_len) !=
+		     (page_to_phys(page) + off)))
+			return false;
+ merge:
+		bv->bv_len += len;
+		bio->bi_iter.bi_size += len;
+		return true;
 	}
 	return false;
 }