[11/27] bcache: io.c: use bio_set_vec_table

Message ID 1459857443-20611-12-git-send-email-tom.leiming@gmail.com (mailing list archive)
State New, archived

Commit Message

Ming Lei April 5, 2016, 11:56 a.m. UTC
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 drivers/md/bcache/io.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Christoph Hellwig April 5, 2016, 12:49 p.m. UTC | #1
On Tue, Apr 05, 2016 at 07:56:56PM +0800, Ming Lei wrote:
> diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
> index 86a0bb8..1c48462 100644
> --- a/drivers/md/bcache/io.c
> +++ b/drivers/md/bcache/io.c
> @@ -26,8 +26,7 @@ struct bio *bch_bbio_alloc(struct cache_set *c)
>  
>  	bio_init(bio);
>  	bio->bi_flags		|= BIO_POOL_NONE << BIO_POOL_OFFSET;
> -	bio->bi_max_vecs	 = bucket_pages(c);
> -	bio->bi_io_vec		 = bio->bi_inline_vecs;
> +	bio_set_vec_table(bio, bio->bi_inline_vecs, bucket_pages(c));

All this bcache code needs to move away from bio_init on a bio
embedded in a driver private structure toward properly using
bio_alloc / bio_alloc_bioset.  That will also fix the crash
with bcache over md that Shaohua reported, so I'd suggest to fast
track this part of the series.
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ming Lei April 5, 2016, 3:24 p.m. UTC | #2
On Tue, Apr 5, 2016 at 8:49 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Tue, Apr 05, 2016 at 07:56:56PM +0800, Ming Lei wrote:
>> diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
>> index 86a0bb8..1c48462 100644
>> --- a/drivers/md/bcache/io.c
>> +++ b/drivers/md/bcache/io.c
>> @@ -26,8 +26,7 @@ struct bio *bch_bbio_alloc(struct cache_set *c)
>>
>>       bio_init(bio);
>>       bio->bi_flags           |= BIO_POOL_NONE << BIO_POOL_OFFSET;
>> -     bio->bi_max_vecs         = bucket_pages(c);
>> -     bio->bi_io_vec           = bio->bi_inline_vecs;
>> +     bio_set_vec_table(bio, bio->bi_inline_vecs, bucket_pages(c));
>
> All this bcache code needs to move away from bio_init on a bio
> embedded in a driver private structure toward properly using
> bio_alloc / bio_alloc_bioset.  That will also fix the crash
> with bcache over md that Shaohua reported, so I'd suggest to fast
> track this part of the series.

I suggest keeping this usage for the following reasons:

- a bio can be embedded into a bigger structure, which is often allocated
dynamically, so one extra allocation for the bio can be avoided.

- we should support arbitrary bio sizes this way; at least bio_add_page()
supports this usage.  The code also gets much simpler with arbitrary bio
size support, for example prio_io() in bcache.

BTW, the root cause of the bcache crash still isn't clear, because
blk_bio_segment_split() should split a big bio to fit all of the
queue's limits. Maybe the max segment limit isn't computed correctly.

Thanks,
Ming Lei
Christoph Hellwig April 5, 2016, 5:31 p.m. UTC | #3
On Tue, Apr 05, 2016 at 11:24:30PM +0800, Ming Lei wrote:
> - bio can be embedded into one bigger instance, which is often allocated
> dynamically, so one extra allocation for bio can be avoided.

We can also do this the other way around with the bio's front_pad,
which avoids the caller poking into bio details.
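
A userspace sketch of the front_pad idea, assuming a simplified stand-in for struct bio: the allocator reserves space in front of the bio for the driver's private structure, so one allocation still covers both, and the driver recovers its structure with a container_of-style macro. The names `model_bio`, `model_bbio`, `model_container_of`, `model_bio_alloc` and `model_bio_free` are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for struct bio; not the kernel definition. */
struct model_bio {
	unsigned short bi_max_vecs;
	unsigned short bi_vcnt;
};

/* Hypothetical driver-private wrapper; the bio is the last member. */
struct model_bbio {
	unsigned int submit_time_us;
	struct model_bio bio;
};

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Allocate front_pad bytes followed by a bio in one chunk and return
 * the bio, modeling what a bio_set created with a front_pad provides.
 */
static struct model_bio *model_bio_alloc(size_t front_pad,
					 unsigned short max_vecs)
{
	char *p = calloc(1, front_pad + sizeof(struct model_bio));
	struct model_bio *bio;

	if (!p)
		return NULL;

	bio = (struct model_bio *)(p + front_pad);
	bio->bi_max_vecs = max_vecs;
	return bio;
}

/* Free the whole chunk, starting at the front pad. */
static void model_bio_free(struct model_bio *bio, size_t front_pad)
{
	assert(bio != NULL);
	free((char *)bio - front_pad);
}
```

In this model a driver would pass `offsetof(struct model_bbio, bio)` as the front pad and call `model_container_of(bio, struct model_bbio, bio)` to reach its own fields, without initializing any bio internals itself.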

> - we should support arbitrary bio size by this way, at least bio_add_page()
> supports this usage.  Also code gets lots of simplification with arbitrary bio
> size support, such as prio_io(): bcache

There is no reason not to support huge bios in the core bio code;
in fact, using bio_kmalloc you can already allocate huge bios
dynamically right now.  Except that you can't really use them, because the
layers below don't expect that.  Bio-based drivers expect to be able to
call bio_clone and friends on bios passed to them, and might
also make assumptions about the max number of bio segments for now.

> BTW, the root cause for bcache crash still isn't clear now because
> blk_bio_segment_split() should split big bio into proper size with
> all queue's limits. Maybe the max segment limit isn't figured out correctly.

The root cause is pretty simple:  The queue limits matter for request
based drivers, which are the only ones getting bios > BIO_MAX_PAGES
except for the buggy bcache use case.  You'll need to either adjust the
limit for all bio-based drivers or get rid of that one magic caller
not playing by the rules.
Kent Overstreet April 6, 2016, 12:35 a.m. UTC | #4
On Tue, Apr 05, 2016 at 05:49:02AM -0700, Christoph Hellwig wrote:
> On Tue, Apr 05, 2016 at 07:56:56PM +0800, Ming Lei wrote:
> > diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
> > index 86a0bb8..1c48462 100644
> > --- a/drivers/md/bcache/io.c
> > +++ b/drivers/md/bcache/io.c
> > @@ -26,8 +26,7 @@ struct bio *bch_bbio_alloc(struct cache_set *c)
> >  
> >  	bio_init(bio);
> >  	bio->bi_flags		|= BIO_POOL_NONE << BIO_POOL_OFFSET;
> > -	bio->bi_max_vecs	 = bucket_pages(c);
> > -	bio->bi_io_vec		 = bio->bi_inline_vecs;
> > +	bio_set_vec_table(bio, bio->bi_inline_vecs, bucket_pages(c));
> 
> All this bcache code needs to move away from bio_init on a bio
> embedded in a driver private structure toward properly using
> bio_alloc / bio_alloc_bioset.  That will also fix the crash
> with bcache over md that Shaohua reported, so I'd suggest to fast
> track this part of the series.

Why?

bio_init() is a publicly exported function, it's always been one, and bcache is
not the only driver to use it directly.

bios with > BIO_MAX_PAGES bvecs are a separate issue; I would argue that the bug
is in md's queue_limits; it uses blk_set_stacking_limits() which sets
max_segments = USHRT_MAX, which is wrong if it's going to clone the biovec.
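
The mismatch described above can be modeled in userspace. This is a sketch under assumptions: `MODEL_BIO_MAX_PAGES` mirrors the BIO_MAX_PAGES value of 256 used by kernels of this period, each bvec is treated as one segment (which ignores segment merging), and `model_set_stacking_limits()`, `model_passes_unsplit()` and `model_clone_fits()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BIO_MAX_PAGES 256u

/* Only the one queue limit discussed in the thread is modeled. */
struct model_queue_limits {
	unsigned short max_segments;
};

/* Models the max_segments assignment in blk_set_stacking_limits(). */
static void model_set_stacking_limits(struct model_queue_limits *lim)
{
	assert(lim != NULL);
	lim->max_segments = USHRT_MAX;
}

/* True when a split on max_segments would leave the bio untouched. */
static bool model_passes_unsplit(const struct model_queue_limits *lim,
				 unsigned int nr_vecs)
{
	return nr_vecs <= lim->max_segments;
}

/* True when a clone whose biovec is capped at BIO_MAX_PAGES can hold it. */
static bool model_clone_fits(unsigned int nr_vecs)
{
	return nr_vecs <= MODEL_BIO_MAX_PAGES;
}
```

With stacking limits applied, a 512-vec bio passes `model_passes_unsplit()` but fails `model_clone_fits()`, which is the gap between the advertised limit and what a driver that clones the biovec can actually accept.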
Patch

diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index 86a0bb8..1c48462 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -26,8 +26,7 @@  struct bio *bch_bbio_alloc(struct cache_set *c)
 
 	bio_init(bio);
 	bio->bi_flags		|= BIO_POOL_NONE << BIO_POOL_OFFSET;
-	bio->bi_max_vecs	 = bucket_pages(c);
-	bio->bi_io_vec		 = bio->bi_inline_vecs;
+	bio_set_vec_table(bio, bio->bi_inline_vecs, bucket_pages(c));
 
 	return bio;
 }
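
For readers without the rest of the series at hand, the change folds the two removed assignments into one call. The sketch below is a userspace model only: `struct sketch_bio`, `struct sketch_bio_vec` and `sketch_bio_set_vec_table()` are hypothetical stand-ins, and the real bio_set_vec_table() is introduced elsewhere in the series and not shown here, so its exact body is an assumption inferred from the lines this patch removes.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; not the kernel definitions. */
struct sketch_bio_vec {
	void *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

struct sketch_bio {
	unsigned short bi_max_vecs;
	struct sketch_bio_vec *bi_io_vec;
};

/*
 * Assumed behavior: record the vector table and its capacity together,
 * so callers no longer assign the two fields by hand.
 */
static void sketch_bio_set_vec_table(struct sketch_bio *bio,
				     struct sketch_bio_vec *table,
				     unsigned int max_vecs)
{
	assert(bio != NULL && table != NULL);
	bio->bi_max_vecs = (unsigned short)max_vecs;
	bio->bi_io_vec = table;
}
```

Keeping both fields behind one helper means a later change to how the vector table is stored would only need to touch the helper, not every driver that sets up its own table.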