diff mbox

block: Simplify the bio cloning implementation

Message ID 20180626222624.11739-1-bart.vanassche@wdc.com (mailing list archive)
State New, archived
Headers show

Commit Message

Bart Van Assche June 26, 2018, 10:26 p.m. UTC
There is no good reason to use different code paths for different
request operations. Hence remove the switch/case statement from
bio_clone_bioset().

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

Comments

Ming Lei June 27, 2018, 1:13 a.m. UTC | #1
On Tue, Jun 26, 2018 at 03:26:24PM -0700, Bart Van Assche wrote:
> There is no good reason to use different code paths for different
> request operations. Hence remove the switch/case statement from
> bio_clone_bioset().
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
>  block/bio.c | 15 ++-------------
>  1 file changed, 2 insertions(+), 13 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index f7e3d88bd0b6..4c27cc9ea55e 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -691,19 +691,8 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t gfp_mask,
>  	bio->bi_iter.bi_sector	= bio_src->bi_iter.bi_sector;
>  	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
>  
> -	switch (bio_op(bio)) {
> -	case REQ_OP_DISCARD:
> -	case REQ_OP_SECURE_ERASE:
> -	case REQ_OP_WRITE_ZEROES:
> -		break;
> -	case REQ_OP_WRITE_SAME:
> -		bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0];
> -		break;
> -	default:
> -		bio_for_each_segment(bv, bio_src, iter)
> -			bio->bi_io_vec[bio->bi_vcnt++] = bv;
> -		break;
> -	}
> +	bio_for_each_segment(bv, bio_src, iter)
> +		bio->bi_io_vec[bio->bi_vcnt++] = bv;

The above change may not be correct for WRITE_SAME, since
bio_src->bi_iter.bi_size should be the actual bytes to write by drive.

Also Christoph has killed bio_clone_bioset() already, and I think that is
correct thing to do, see:

	https://marc.info/?l=linux-block&m=152938432207215&w=2

Thanks,
Ming
Bart Van Assche June 27, 2018, 5:48 p.m. UTC | #2
On 06/26/18 18:13, Ming Lei wrote:
> On Tue, Jun 26, 2018 at 03:26:24PM -0700, Bart Van Assche wrote:
>> There is no good reason to use different code paths for different
>> request operations. Hence remove the switch/case statement from
>> bio_clone_bioset().
>>
>> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> ---
>>   block/bio.c | 15 ++-------------
>>   1 file changed, 2 insertions(+), 13 deletions(-)
>>
>> diff --git a/block/bio.c b/block/bio.c
>> index f7e3d88bd0b6..4c27cc9ea55e 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -691,19 +691,8 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t gfp_mask,
>>   	bio->bi_iter.bi_sector	= bio_src->bi_iter.bi_sector;
>>   	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
>>   
>> -	switch (bio_op(bio)) {
>> -	case REQ_OP_DISCARD:
>> -	case REQ_OP_SECURE_ERASE:
>> -	case REQ_OP_WRITE_ZEROES:
>> -		break;
>> -	case REQ_OP_WRITE_SAME:
>> -		bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0];
>> -		break;
>> -	default:
>> -		bio_for_each_segment(bv, bio_src, iter)
>> -			bio->bi_io_vec[bio->bi_vcnt++] = bv;
>> -		break;
>> -	}
>> +	bio_for_each_segment(bv, bio_src, iter)
>> +		bio->bi_io_vec[bio->bi_vcnt++] = bv;
> 
> The above change may not be correct for WRITE_SAME, since
> bio_src->bi_iter.bi_size should be the actual bytes to write by drive.

Since bio_for_each_segment() neither modifies bio_src->bi_iter nor 
bio->bi_iter, the above patch retains the value copied into 
bio->bi_iter.bi_size before bio_for_each_segment() was called. In other 
words, bio_src->bi_iter.bi_size is not modified and the resulting 
bio->bi_iter.bi_size should be identical with or without this patch.

> Also Christoph has killed bio_clone_bioset() already, and I think that is
> correct thing to do, see:
> 
> 	https://marc.info/?l=linux-block&m=152938432207215&w=2

Anyway, I'm fine with using Christoph's approach and dropping this patch.

Bart.
Ming Lei June 27, 2018, 11:20 p.m. UTC | #3
On Wed, Jun 27, 2018 at 10:48:06AM -0700, Bart Van Assche wrote:
> On 06/26/18 18:13, Ming Lei wrote:
> > On Tue, Jun 26, 2018 at 03:26:24PM -0700, Bart Van Assche wrote:
> > > There is no good reason to use different code paths for different
> > > request operations. Hence remove the switch/case statement from
> > > bio_clone_bioset().
> > > 
> > > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Ming Lei <ming.lei@redhat.com>
> > > ---
> > >   block/bio.c | 15 ++-------------
> > >   1 file changed, 2 insertions(+), 13 deletions(-)
> > > 
> > > diff --git a/block/bio.c b/block/bio.c
> > > index f7e3d88bd0b6..4c27cc9ea55e 100644
> > > --- a/block/bio.c
> > > +++ b/block/bio.c
> > > @@ -691,19 +691,8 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t gfp_mask,
> > >   	bio->bi_iter.bi_sector	= bio_src->bi_iter.bi_sector;
> > >   	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
> > > -	switch (bio_op(bio)) {
> > > -	case REQ_OP_DISCARD:
> > > -	case REQ_OP_SECURE_ERASE:
> > > -	case REQ_OP_WRITE_ZEROES:
> > > -		break;
> > > -	case REQ_OP_WRITE_SAME:
> > > -		bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0];
> > > -		break;
> > > -	default:
> > > -		bio_for_each_segment(bv, bio_src, iter)
> > > -			bio->bi_io_vec[bio->bi_vcnt++] = bv;
> > > -		break;
> > > -	}
> > > +	bio_for_each_segment(bv, bio_src, iter)
> > > +		bio->bi_io_vec[bio->bi_vcnt++] = bv;
> > 
> > The above change may not be correct for WRITE_SAME, since
> > bio_src->bi_iter.bi_size should be the actual bytes to write by drive.
> 
> Since bio_for_each_segment() neither modifies bio_src->bi_iter nor
> bio->bi_iter, the above patch retains the value copied into
> bio->bi_iter.bi_size before bio_for_each_segment() was called. In other

Yes.

> words, bio_src->bi_iter.bi_size is not modified and the resulting
> bio->bi_iter.bi_size should be identical with or without this patch.

That is true too.

But,

What we need to do is to only copy the 1st bvec for WRITE_SAME, your patch
changes to copy (bio->bi_iter.bi_size / block size) bvecs, then memory corruption
may be triggered. So bio_for_each_segment() can't be used here.

Thanks,
Ming
Bart Van Assche June 27, 2018, 11:55 p.m. UTC | #4
On 06/27/18 16:21, Ming Lei wrote:
> What we need to do is to only copy the 1st bvec for WRITE_SAME, your patch
> changes to copy (bio->bi_iter.bi_size / block size) bvecs, then memory corruption
> may be triggered. So bio_for_each_segment() can't be used here.

Has it been considered to use memcpy() to copy the bi_vcnt bio_vecs 
instead of using bio_for_each_segment() in bio_clone_bioset()? That will 
in this context probably be even faster than using bio_for_each_segment().

Bart.
Ming Lei June 28, 2018, 12:08 a.m. UTC | #5
On Wed, Jun 27, 2018 at 04:55:20PM -0700, Bart Van Assche wrote:
> On 06/27/18 16:21, Ming Lei wrote:
> > What we need to do is to only copy the 1st bvec for WRITE_SAME, your patch
> > changes to copy (bio->bi_iter.bi_size / block size) bvecs, then memory corruption
> > may be triggered. So bio_for_each_segment() can't be used here.
> 
> Has it been considered to use memcpy() to copy the bi_vcnt bio_vecs instead
> of using bio_for_each_segment() in bio_clone_bioset()? That will in this
> context probably be even faster than using bio_for_each_segment().

After immutable bvec is introduced:

1) there is little chance in which bvec table need to copy, so performance
may not be a issue, as you see, bio_clone_bioset() is going to die now.

2) bi_vcnt/bi_io_vec can't be used directly on fast-cloned bio

2) code is simplified a lot by using iterator helper since bio can be
advanced in unit of byte. In theory, we may cook a special helper to
speed up the copy, but still depends on use cases.

thanks,
Ming
diff mbox

Patch

diff --git a/block/bio.c b/block/bio.c
index f7e3d88bd0b6..4c27cc9ea55e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -691,19 +691,8 @@  struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t gfp_mask,
 	bio->bi_iter.bi_sector	= bio_src->bi_iter.bi_sector;
 	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
 
-	switch (bio_op(bio)) {
-	case REQ_OP_DISCARD:
-	case REQ_OP_SECURE_ERASE:
-	case REQ_OP_WRITE_ZEROES:
-		break;
-	case REQ_OP_WRITE_SAME:
-		bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0];
-		break;
-	default:
-		bio_for_each_segment(bv, bio_src, iter)
-			bio->bi_io_vec[bio->bi_vcnt++] = bv;
-		break;
-	}
+	bio_for_each_segment(bv, bio_src, iter)
+		bio->bi_io_vec[bio->bi_vcnt++] = bv;
 
 	if (bio_integrity(bio_src)) {
 		int ret;