Message ID | 1e3d150c-8d0c-42b9-b479-0aa55f0ab86f@kernel.dk (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | io_uring/rw: always clear ->bytes_done on io_async_rw setup | expand |
Jens Axboe <axboe@kernel.dk> writes: > A previous commit mistakenly moved the clearing of the in-progress byte > count into the section that's dependent on having a cached iovec or not, > but it should be cleared for any IO. If not, then extra bytes may be > added at IO completion time, causing potentially weird behavior like > over-reporting the amount of IO done. Hi Jens, Sorry for the delay. I went completely offline during the christmas week. Did this solve the sysbot report? I'm failing to understand how it can happen. This could only be hit if the allocation returned a cached object that doesn't have a free_iov, since any newly kmalloc'ed object will have this field cleaned inside the io_rw_async_data_init callback. But I don't understand where we can cache the rw object without having a valid free_iov - it didn't seem possible to me before or now. the iov is freed only by io_rw_iovec_free, which is called from (1) io_rw_recycle, in the case where we don't cache. we drop also drop the CLEANUP flag, so we will just call kfree inside io_clean_op later. (2) io_readv_writev_cleanup: where we also don't cache, since we are inside the io_clean_op, we'll just hit the kfree(req->async_data), and (3) io_rw_cache_free: where we are emptying the cache to shut down. > diff --git a/io_uring/rw.c b/io_uring/rw.c > index 75f70935ccf4..ca1b19d3d142 100644 > --- a/io_uring/rw.c > +++ b/io_uring/rw.c > @@ -228,8 +228,8 @@ static int io_rw_alloc_async(struct io_kiocb *req) > kasan_mempool_unpoison_object(rw->free_iovec, > rw->free_iov_nr * sizeof(struct iovec)); > req->flags |= REQ_F_NEED_CLEANUP; > - rw->bytes_done = 0; > } > + rw->bytes_done = 0; > return 0; > }
On 12/30/24 9:08 AM, Gabriel Krisman Bertazi wrote: > Jens Axboe <axboe@kernel.dk> writes: > >> A previous commit mistakenly moved the clearing of the in-progress byte >> count into the section that's dependent on having a cached iovec or not, >> but it should be cleared for any IO. If not, then extra bytes may be >> added at IO completion time, causing potentially weird behavior like >> over-reporting the amount of IO done. > > Hi Jens, > > Sorry for the delay. I went completely offline during the christmas > week. No worries, sounds like a good plan! > Did this solve the sysbot report? I'm failing to understand how it can > happen. This could only be hit if the allocation returned a cached > object that doesn't have a free_iov, since any newly kmalloc'ed object > will have this field cleaned inside the io_rw_async_data_init callback. > But I don't understand where we can cache the rw object without having a > valid free_iov - it didn't seem possible to me before or now. Not sure I follow - you may never have a valid free_iov, it completely depends on whether or not the existing rw user needed to allocate an iov or not. Hence it's indeed possible that there's a free_iov and the user doesn't need or use it, or the opposite of there not being one and the user then allocating one that persists. In any case, it's of course orthogonal to the issue here, which is that ->bytes_done must _always_ be initialized, it has no dependency on a free_iovec or not. Whenever someone gets an 'rw', it should be pristine in that sense.
Jens Axboe <axboe@kernel.dk> writes: > On 12/30/24 9:08 AM, Gabriel Krisman Bertazi wrote: >> Jens Axboe <axboe@kernel.dk> writes: >> >>> A previous commit mistakenly moved the clearing of the in-progress byte >>> count into the section that's dependent on having a cached iovec or not, >>> but it should be cleared for any IO. If not, then extra bytes may be >>> added at IO completion time, causing potentially weird behavior like >>> over-reporting the amount of IO done. >> >> Hi Jens, >> >> Sorry for the delay. I went completely offline during the christmas >> week. > > No worries, sounds like a good plan! > >> Did this solve the sysbot report? I'm failing to understand how it can >> happen. This could only be hit if the allocation returned a cached >> object that doesn't have a free_iov, since any newly kmalloc'ed object >> will have this field cleaned inside the io_rw_async_data_init callback. >> But I don't understand where we can cache the rw object without having a >> valid free_iov - it didn't seem possible to me before or now. > > Not sure I follow - you may never have a valid free_iov, it completely > depends on whether or not the existing rw user needed to allocate an iov > or not. > Hence it's indeed possible that there's a free_iov and the user > doesn't need or use it, or the opposite of there not being one and the > user then allocating one that persists. > > In any case, it's of course orthogonal to the issue here, which is that > ->bytes_done must _always_ be initialized, it has no dependency on a > free_iovec or not. Whenever someone gets an 'rw', it should be pristine > in that sense. I see. In addition, I was actually confusing rw->free_iov_nr with rw->bytes_done when writing my previous message. The first needs to have a valid value if ->free_iov is valid. Thanks for the explanation and making me review this code. The fix looks good to me now, obviously. Thanks,
On 12/30/24 4:02 PM, Gabriel Krisman Bertazi wrote: > Jens Axboe <axboe@kernel.dk> writes: > >> On 12/30/24 9:08 AM, Gabriel Krisman Bertazi wrote: >>> Jens Axboe <axboe@kernel.dk> writes: >>> >>>> A previous commit mistakenly moved the clearing of the in-progress byte >>>> count into the section that's dependent on having a cached iovec or not, >>>> but it should be cleared for any IO. If not, then extra bytes may be >>>> added at IO completion time, causing potentially weird behavior like >>>> over-reporting the amount of IO done. >>> >>> Hi Jens, >>> >>> Sorry for the delay. I went completely offline during the christmas >>> week. >> >> No worries, sounds like a good plan! >> >>> Did this solve the sysbot report? I'm failing to understand how it can >>> happen. This could only be hit if the allocation returned a cached >>> object that doesn't have a free_iov, since any newly kmalloc'ed object >>> will have this field cleaned inside the io_rw_async_data_init callback. >>> But I don't understand where we can cache the rw object without having a >>> valid free_iov - it didn't seem possible to me before or now. >> >> Not sure I follow - you may never have a valid free_iov, it completely >> depends on whether or not the existing rw user needed to allocate an iov >> or not. > >> Hence it's indeed possible that there's a free_iov and the user >> doesn't need or use it, or the opposite of there not being one and the >> user then allocating one that persists. >> >> In any case, it's of course orthogonal to the issue here, which is that >> ->bytes_done must _always_ be initialized, it has no dependency on a >> free_iovec or not. Whenever someone gets an 'rw', it should be pristine >> in that sense. > > I see. In addition, I was actually confusing rw->free_iov_nr with > rw->bytes_done when writing my previous message. The first needs to > have a valid value if ->free_iov is valid. Thanks for the explanation > and making me review this code. Ah right, yes free_iov_nr would obviously need to be valid if free_iov is set. > The fix looks good to me now, obviously. Thanks for checking!
diff --git a/io_uring/rw.c b/io_uring/rw.c index 75f70935ccf4..ca1b19d3d142 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -228,8 +228,8 @@ static int io_rw_alloc_async(struct io_kiocb *req) kasan_mempool_unpoison_object(rw->free_iovec, rw->free_iov_nr * sizeof(struct iovec)); req->flags |= REQ_F_NEED_CLEANUP; - rw->bytes_done = 0; } + rw->bytes_done = 0; return 0; }
A previous commit mistakenly moved the clearing of the in-progress byte count into the section that's dependent on having a cached iovec or not, but it should be cleared for any IO. If not, then extra bytes may be added at IO completion time, causing potentially weird behavior like over-reporting the amount of IO done. Fixes: f628c7e5a7c0 ("io_uring/rw: Allocate async data through helper") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202412271132.a09c3500-lkp@intel.com Signed-off-by: Jens Axboe <axboe@kernel.dk> ---