diff mbox series

io_uring/rw: always clear ->bytes_done on io_async_rw setup

Message ID 1e3d150c-8d0c-42b9-b479-0aa55f0ab86f@kernel.dk (mailing list archive)
State New
Headers show
Series io_uring/rw: always clear ->bytes_done on io_async_rw setup | expand

Commit Message

Jens Axboe Dec. 27, 2024, 4:53 p.m. UTC
A previous commit mistakenly moved the clearing of the in-progress byte
count into the section that's dependent on having a cached iovec or not,
but it should be cleared for any IO. If not, then extra bytes may be
added at IO completion time, causing potentially weird behavior like
over-reporting the amount of IO done.

Fixes: f628c7e5a7c0 ("io_uring/rw: Allocate async data through helper")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202412271132.a09c3500-lkp@intel.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

Comments

Gabriel Krisman Bertazi Dec. 30, 2024, 4:08 p.m. UTC | #1
Jens Axboe <axboe@kernel.dk> writes:

> A previous commit mistakenly moved the clearing of the in-progress byte
> count into the section that's dependent on having a cached iovec or not,
> but it should be cleared for any IO. If not, then extra bytes may be
> added at IO completion time, causing potentially weird behavior like
> over-reporting the amount of IO done.

Hi Jens,

Sorry for the delay.  I went completely offline during the christmas
week.

Did this solve the sysbot report?  I'm failing to understand how it can
happen.  This could only be hit if the allocation returned a cached
object that doesn't have a free_iov, since any newly kmalloc'ed object
will have this field cleaned inside the io_rw_async_data_init callback.
But I don't understand where we can cache the rw object without having a
valid free_iov - it didn't seem possible to me before or now.

the iov is freed only by io_rw_iovec_free, which is called from

(1) io_rw_recycle, in the case where we don't cache.  we drop also
drop the CLEANUP flag, so we will just call kfree inside io_clean_op later.
(2) io_readv_writev_cleanup: where we also don't cache, since we are inside
    the io_clean_op, we'll just hit the kfree(req->async_data), and
(3) io_rw_cache_free:  where we are emptying the cache to shut down.

> diff --git a/io_uring/rw.c b/io_uring/rw.c
> index 75f70935ccf4..ca1b19d3d142 100644
> --- a/io_uring/rw.c
> +++ b/io_uring/rw.c
> @@ -228,8 +228,8 @@ static int io_rw_alloc_async(struct io_kiocb *req)
>  		kasan_mempool_unpoison_object(rw->free_iovec,
>  					      rw->free_iov_nr * sizeof(struct iovec));
>  		req->flags |= REQ_F_NEED_CLEANUP;
> -		rw->bytes_done = 0;
>  	}
> +	rw->bytes_done = 0;
>  	return 0;
>  }
Jens Axboe Dec. 30, 2024, 4:58 p.m. UTC | #2
On 12/30/24 9:08 AM, Gabriel Krisman Bertazi wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> A previous commit mistakenly moved the clearing of the in-progress byte
>> count into the section that's dependent on having a cached iovec or not,
>> but it should be cleared for any IO. If not, then extra bytes may be
>> added at IO completion time, causing potentially weird behavior like
>> over-reporting the amount of IO done.
> 
> Hi Jens,
> 
> Sorry for the delay.  I went completely offline during the christmas
> week.

No worries, sounds like a good plan!

> Did this solve the sysbot report?  I'm failing to understand how it can
> happen.  This could only be hit if the allocation returned a cached
> object that doesn't have a free_iov, since any newly kmalloc'ed object
> will have this field cleaned inside the io_rw_async_data_init callback.
> But I don't understand where we can cache the rw object without having a
> valid free_iov - it didn't seem possible to me before or now.

Not sure I follow - you may never have a valid free_iov, it completely
depends on whether or not the existing rw user needed to allocate an iov
or not. Hence it's indeed possible that there's a free_iov and the user
doesn't need or use it, or the opposite of there not being one and the
user then allocating one that persists.

In any case, it's of course orthogonal to the issue here, which is that
->bytes_done must _always_ be initialized, it has no dependency on a
free_iovec or not. Whenever someone gets an 'rw', it should be pristine
in that sense.
Gabriel Krisman Bertazi Dec. 30, 2024, 11:02 p.m. UTC | #3
Jens Axboe <axboe@kernel.dk> writes:

> On 12/30/24 9:08 AM, Gabriel Krisman Bertazi wrote:
>> Jens Axboe <axboe@kernel.dk> writes:
>> 
>>> A previous commit mistakenly moved the clearing of the in-progress byte
>>> count into the section that's dependent on having a cached iovec or not,
>>> but it should be cleared for any IO. If not, then extra bytes may be
>>> added at IO completion time, causing potentially weird behavior like
>>> over-reporting the amount of IO done.
>> 
>> Hi Jens,
>> 
>> Sorry for the delay.  I went completely offline during the christmas
>> week.
>
> No worries, sounds like a good plan!
>
>> Did this solve the sysbot report?  I'm failing to understand how it can
>> happen.  This could only be hit if the allocation returned a cached
>> object that doesn't have a free_iov, since any newly kmalloc'ed object
>> will have this field cleaned inside the io_rw_async_data_init callback.
>> But I don't understand where we can cache the rw object without having a
>> valid free_iov - it didn't seem possible to me before or now.
>
> Not sure I follow - you may never have a valid free_iov, it completely
> depends on whether or not the existing rw user needed to allocate an iov
> or not.

> Hence it's indeed possible that there's a free_iov and the user
> doesn't need or use it, or the opposite of there not being one and the
> user then allocating one that persists.
>
> In any case, it's of course orthogonal to the issue here, which is that
> ->bytes_done must _always_ be initialized, it has no dependency on a
> free_iovec or not. Whenever someone gets an 'rw', it should be pristine
> in that sense.

I see. In addition, I was actually confusing rw->free_iov_nr with
rw->bytes_done when writing my previous message.  The first needs to
have a valid value if ->free_iov is valid. Thanks for the explanation
and making me review this code.

The fix looks good to me now, obviously.

Thanks,
Jens Axboe Dec. 31, 2024, 12:13 a.m. UTC | #4
On 12/30/24 4:02 PM, Gabriel Krisman Bertazi wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> On 12/30/24 9:08 AM, Gabriel Krisman Bertazi wrote:
>>> Jens Axboe <axboe@kernel.dk> writes:
>>>
>>>> A previous commit mistakenly moved the clearing of the in-progress byte
>>>> count into the section that's dependent on having a cached iovec or not,
>>>> but it should be cleared for any IO. If not, then extra bytes may be
>>>> added at IO completion time, causing potentially weird behavior like
>>>> over-reporting the amount of IO done.
>>>
>>> Hi Jens,
>>>
>>> Sorry for the delay.  I went completely offline during the christmas
>>> week.
>>
>> No worries, sounds like a good plan!
>>
>>> Did this solve the sysbot report?  I'm failing to understand how it can
>>> happen.  This could only be hit if the allocation returned a cached
>>> object that doesn't have a free_iov, since any newly kmalloc'ed object
>>> will have this field cleaned inside the io_rw_async_data_init callback.
>>> But I don't understand where we can cache the rw object without having a
>>> valid free_iov - it didn't seem possible to me before or now.
>>
>> Not sure I follow - you may never have a valid free_iov, it completely
>> depends on whether or not the existing rw user needed to allocate an iov
>> or not.
> 
>> Hence it's indeed possible that there's a free_iov and the user
>> doesn't need or use it, or the opposite of there not being one and the
>> user then allocating one that persists.
>>
>> In any case, it's of course orthogonal to the issue here, which is that
>> ->bytes_done must _always_ be initialized, it has no dependency on a
>> free_iovec or not. Whenever someone gets an 'rw', it should be pristine
>> in that sense.
> 
> I see. In addition, I was actually confusing rw->free_iov_nr with
> rw->bytes_done when writing my previous message.  The first needs to
> have a valid value if ->free_iov is valid. Thanks for the explanation
> and making me review this code.

Ah right, yes free_iov_nr would obviously need to be valid if free_iov
is set.

> The fix looks good to me now, obviously.

Thanks for checking!
diff mbox series

Patch

diff --git a/io_uring/rw.c b/io_uring/rw.c
index 75f70935ccf4..ca1b19d3d142 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -228,8 +228,8 @@  static int io_rw_alloc_async(struct io_kiocb *req)
 		kasan_mempool_unpoison_object(rw->free_iovec,
 					      rw->free_iov_nr * sizeof(struct iovec));
 		req->flags |= REQ_F_NEED_CLEANUP;
-		rw->bytes_done = 0;
 	}
+	rw->bytes_done = 0;
 	return 0;
 }