diff mbox series

[03/16] block: optimise req_bio_endio()

Message ID 8061fa49096e1a0e44849aa204a0a1ae57c4423a.1634676157.git.asml.silence@gmail.com (mailing list archive)
State New, archived
Headers show
Series block optimisation round | expand

Commit Message

Pavel Begunkov Oct. 19, 2021, 9:24 p.m. UTC
First, get rid of an extra branch and chain error checks. Also reshuffle
it with bio_advance(), so it goes closer to the final check, with that
the compiler loads rq->rq_flags only once, and also doesn't reload
bio->bi_iter.bi_size if bio_advance() didn't actually advanced the iter.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/blk-mq.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

Comments

Christoph Hellwig Oct. 20, 2021, 6:11 a.m. UTC | #1
On Tue, Oct 19, 2021 at 10:24:12PM +0100, Pavel Begunkov wrote:
> First, get rid of an extra branch and chain error checks. Also reshuffle
> it with bio_advance(), so it goes closer to the final check, with that
> the compiler loads rq->rq_flags only once, and also doesn't reload
> bio->bi_iter.bi_size if bio_advance() didn't actually advanced the iter.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

We should also probably look into killing the whole strange RQF_QUIET
to BIO_QUIET some day.
Shinichiro Kawasaki Oct. 22, 2021, 9:58 a.m. UTC | #2
Hello Pavel,

Recently I tried out for-next branch and observed that simple dd command to
zonefs files causes an I/O error.

$ sudo dd if=/dev/zero of=/mnt/seq/0 bs=4096 count=1 oflag=direct
dd: error writing '/mnt/seq/0': Input/output error
1+0 records in
0+0 records out
0 bytes copied, 0.00409641 s, 0.0 kB/s

At that time, kernel reported warnings.

[90713.298721][ T2735] zonefs (nvme0n1) WARNING: inode 1: invalid size 0 (should be 4096)
[90713.299761][ T2735] zonefs (nvme0n1) WARNING: remounting filesystem read-only

I bisected and found that this patch triggers the error and warnings. I think
one liner change is needed in this patch. Please find it below, in line.


On Oct 19, 2021 / 22:24, Pavel Begunkov wrote:
> First, get rid of an extra branch and chain error checks. Also reshuffle
> it with bio_advance(), so it goes closer to the final check, with that
> the compiler loads rq->rq_flags only once, and also doesn't reload
> bio->bi_iter.bi_size if bio_advance() didn't actually advanced the iter.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  block/blk-mq.c | 16 +++++++---------
>  1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 3481a8712234..bab1fccda6ca 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -617,25 +617,23 @@ void blk_mq_free_plug_rqs(struct blk_plug *plug)
>  static void req_bio_endio(struct request *rq, struct bio *bio,
>  			  unsigned int nbytes, blk_status_t error)
>  {
> -	if (error)
> +	if (unlikely(error)) {
>  		bio->bi_status = error;
> -
> -	if (unlikely(rq->rq_flags & RQF_QUIET))
> -		bio_set_flag(bio, BIO_QUIET);
> -
> -	bio_advance(bio, nbytes);
> -
> -	if (req_op(rq) == REQ_OP_ZONE_APPEND && error == BLK_STS_OK) {
> +	} else if (req_op(rq) == REQ_OP_ZONE_APPEND) {
>  		/*
>  		 * Partial zone append completions cannot be supported as the
>  		 * BIO fragments may end up not being written sequentially.
>  		 */
> -		if (bio->bi_iter.bi_size)
> +		if (bio->bi_iter.bi_size == nbytes)

I think the line above should be,

		if (bio->bi_iter.bi_size != nbytes)

Before applying the patch, the if statement checked "bi_size is not zero".
After applying the patch, bio_advance(bio, nbytes) moved after this check.
Then bi_size is not decremented by nbytes and the check should be "bi_size is
not nbytes". With this modification, the I/O error and the warnings go away.

>  			bio->bi_status = BLK_STS_IOERR;
>  		else
>  			bio->bi_iter.bi_sector = rq->__sector;
>  	}
>  
> +	bio_advance(bio, nbytes);
> +
> +	if (unlikely(rq->rq_flags & RQF_QUIET))
> +		bio_set_flag(bio, BIO_QUIET);
>  	/* don't actually finish bio if it's part of flush sequence */
>  	if (bio->bi_iter.bi_size == 0 && !(rq->rq_flags & RQF_FLUSH_SEQ))
>  		bio_endio(bio);
> -- 
> 2.33.1
>
Pavel Begunkov Oct. 22, 2021, 10:58 a.m. UTC | #3
On 10/22/21 10:58, Shinichiro Kawasaki wrote:
> Hello Pavel,
> 
> Recently I tried out for-next branch and observed that simple dd command to
> zonefs files causes an I/O error.
> 
> $ sudo dd if=/dev/zero of=/mnt/seq/0 bs=4096 count=1 oflag=direct
> dd: error writing '/mnt/seq/0': Input/output error
> 1+0 records in
> 0+0 records out
> 0 bytes copied, 0.00409641 s, 0.0 kB/s
> 
> At that time, kernel reported warnings.
> 
> [90713.298721][ T2735] zonefs (nvme0n1) WARNING: inode 1: invalid size 0 (should be 4096)
> [90713.299761][ T2735] zonefs (nvme0n1) WARNING: remounting filesystem read-only
> 
> I bisected and found that this patch triggers the error and warnings. I think
> one liner change is needed in this patch. Please find it below, in line.

[...]
> -	if (req_op(rq) == REQ_OP_ZONE_APPEND && error == BLK_STS_OK) {
>> +	} else if (req_op(rq) == REQ_OP_ZONE_APPEND) {
>>   		/*
>>   		 * Partial zone append completions cannot be supported as the
>>   		 * BIO fragments may end up not being written sequentially.
>>   		 */
>> -		if (bio->bi_iter.bi_size)
>> +		if (bio->bi_iter.bi_size == nbytes)
> 
> I think the line above should be,
> 
> 		if (bio->bi_iter.bi_size != nbytes)

You're right, that was a stupid mistake, thanks!

Jens, will you fold it in or would you prefer a patch?


> Before applying the patch, the if statement checked "bi_size is not zero".
> After applying the patch, bio_advance(bio, nbytes) moved after this check.
> Then bi_size is not decremented by nbytes and the check should be "bi_size is
> not nbytes". With this modification, the I/O error and the warnings go away.
> 
>>   			bio->bi_status = BLK_STS_IOERR;
>>   		else
>>   			bio->bi_iter.bi_sector = rq->__sector;
>>   	}
>>   
>> +	bio_advance(bio, nbytes);
>> +
>> +	if (unlikely(rq->rq_flags & RQF_QUIET))
>> +		bio_set_flag(bio, BIO_QUIET);
>>   	/* don't actually finish bio if it's part of flush sequence */
>>   	if (bio->bi_iter.bi_size == 0 && !(rq->rq_flags & RQF_FLUSH_SEQ))
>>   		bio_endio(bio);
>> -- 
>> 2.33.1
>>
>
diff mbox series

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3481a8712234..bab1fccda6ca 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -617,25 +617,23 @@  void blk_mq_free_plug_rqs(struct blk_plug *plug)
 static void req_bio_endio(struct request *rq, struct bio *bio,
 			  unsigned int nbytes, blk_status_t error)
 {
-	if (error)
+	if (unlikely(error)) {
 		bio->bi_status = error;
-
-	if (unlikely(rq->rq_flags & RQF_QUIET))
-		bio_set_flag(bio, BIO_QUIET);
-
-	bio_advance(bio, nbytes);
-
-	if (req_op(rq) == REQ_OP_ZONE_APPEND && error == BLK_STS_OK) {
+	} else if (req_op(rq) == REQ_OP_ZONE_APPEND) {
 		/*
 		 * Partial zone append completions cannot be supported as the
 		 * BIO fragments may end up not being written sequentially.
 		 */
-		if (bio->bi_iter.bi_size)
+		if (bio->bi_iter.bi_size == nbytes)
 			bio->bi_status = BLK_STS_IOERR;
 		else
 			bio->bi_iter.bi_sector = rq->__sector;
 	}
 
+	bio_advance(bio, nbytes);
+
+	if (unlikely(rq->rq_flags & RQF_QUIET))
+		bio_set_flag(bio, BIO_QUIET);
 	/* don't actually finish bio if it's part of flush sequence */
 	if (bio->bi_iter.bi_size == 0 && !(rq->rq_flags & RQF_FLUSH_SEQ))
 		bio_endio(bio);