Message ID | cover.1572189860.git.asml.silence@gmail.com (mailing list archive) |
---|---|
Series | cleanup submission path |
On 10/27/19 9:35 AM, Pavel Begunkov wrote:
> A small cleanup of very similar but diverged io_submit_sqes() and
> io_ring_submit()
>
> Pavel Begunkov (2):
>   io_uring: handle mm_fault outside of submission
>   io_uring: merge io_submit_sqes and io_ring_submit
>
>  fs/io_uring.c | 116 ++++++++++++++------------------------------------
>  1 file changed, 33 insertions(+), 83 deletions(-)

I like the cleanups here, but one thing that seems off is the assumption that io_sq_thread() always needs to grab the mm. If the sqes processed are just READ/WRITE_FIXED, then it never needs to grab the mm.
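The point about READ/WRITE_FIXED can be illustrated with a small userspace model. This is only a sketch: `model_opcode`, `model_sqe` and `model_sqe_needs_user()` are hypothetical names invented for illustration, not the definitions in fs/io_uring.c, and the model assumes that every opcode other than the two fixed-buffer ones may touch user memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the io_uring opcodes (illustration only). */
enum model_opcode {
	MODEL_OP_NOP,
	MODEL_OP_READV,
	MODEL_OP_WRITEV,
	MODEL_OP_FSYNC,
	MODEL_OP_READ_FIXED,
	MODEL_OP_WRITE_FIXED,
};

/* Hypothetical, cut-down submission queue entry. */
struct model_sqe {
	uint8_t opcode;
};

/*
 * Returns true if issuing this sqe may require the submitter's mm.
 * Assumption of this model: fixed-buffer reads and writes use buffers
 * that were registered (and mapped) up front, so they are the only
 * opcodes that never need the mm.
 */
static bool model_sqe_needs_user(const struct model_sqe *sqe)
{
	assert(sqe != NULL);
	return !(sqe->opcode == MODEL_OP_READ_FIXED ||
		 sqe->opcode == MODEL_OP_WRITE_FIXED);
}
```

With such a predicate, a submission loop that sees only fixed-buffer requests would never have to take the mm at all, which is the optimization being asked for here.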
On 27/10/2019 19:32, Jens Axboe wrote:
> I like the cleanups here, but one thing that seems off is the
> assumption that io_sq_thread() always needs to grab the mm. If
> the sqes processed are just READ/WRITE_FIXED, then it never needs
> to grab the mm.

Yeah, we removed it to fix bugs. Personally, I think it would be clearer to do the lazy grabbing conditionally, rather than have two functions. And in this case it's easier to do after merging.

Do you prefer that I bring it back first?
On 10/27/19 10:44 AM, Pavel Begunkov wrote:
> Yeah, we removed it to fix bugs. Personally, I think it would be
> clearer to do lazy grabbing conditionally, rather than have two
> functions. And in this case it's easier to do after merging.
>
> Do you prefer to return it back first?

Ah I see, no I don't care about that.
On 10/27/19 10:49 AM, Jens Axboe wrote:
> Ah I see, no I don't care about that.

OK, looked at the post-patches state. It's still not correct. You are grabbing the mm from io_sq_thread() unconditionally. We should not do that; we should grab it only if the sqes we need to submit need mm context.
On 27/10/2019 19:56, Jens Axboe wrote:
> OK, looked at the post-patches state. It's still not correct. You are
> grabbing the mm from io_sq_thread() unconditionally. We should not do
> that, only if the sqes we need to submit need mm context.

That's what my question on the fix was about :)
1. Then, in what case could it fail?
2. Is it OK to hold it while polling? It could keep it for quite a long time if the host is swift, e.g. submit->poll->submit->poll-> ...

Anyway, I will add it back and resend the patchset.
On 10/27/19 11:19 AM, Pavel Begunkov wrote:
> That's what my question to the fix was about :)
> 1. Then, what the case it could fail?
> 2. Is it ok to hold it while polling? It could keep it for quite
> a long time if host is swift, e.g. submit->poll->submit->poll-> ...
>
> Anyway, I will add it back and resend the patchset.

If possible in a simple way, I'd prefer if we do it as a prep patch and then queue that up for 5.4 since we now lost that optimization. Then layer the other 2 on top of that, since I'll just rebase the 5.5 stuff on top of that.

If not trivially possible for 5.4, then we'll just have to live with it in that release. For that case, you can fold the change in with these two patches.
On 27/10/2019 20:26, Jens Axboe wrote:
> If possible in a simple way, I'd prefer if we do it as a prep patch and
> then queue that up for 5.4 since we now lost that optimization. Then
> layer the other 2 on top of that, since I'll just rebase the 5.5 stuff
> on top of that.

Sure, will do it this way. There won't be much difference.

> If not trivially possible for 5.4, then we'll just have to live with it
> in that release. For that case, you can fold the change in with these
> two patches.
On 27/10/2019 20:26, Jens Axboe wrote:
> If possible in a simple way, I'd prefer if we do it as a prep patch and
> then queue that up for 5.4 since we now lost that optimization. Then
> layer the other 2 on top of that, since I'll just rebase the 5.5 stuff
> on top of that.
>
> If not trivially possible for 5.4, then we'll just have to live with it
> in that release. For that case, you can fold the change in with these
> two patches.

Hmm, what are the semantics? I think we should fail only those sqes that need the mm but can't get it. The alternative is to fail all subsequent ones after the first mm_fault.
On 10/27/19 12:56 PM, Pavel Begunkov wrote:
> Hmm, what's the semantics? I think we should fail only those who need
> mm, but can't get it. The alternative is to fail all subsequent after
> the first mm_fault.

For the sqthread setup, there's no notion of "do this many". It just grabs whatever it can and issues it. This means that the mm assign is really per-sqe. What we did before, with the batching, just optimized it so we'd only grab it for one batch IFF at least one sqe in that batch needed the mm.

Since you've killed the batching, I think the logic should be something ala:

	if (io_sqe_needs_user(sqe) && !cur_mm) {
		if (already_attempted_mmget_and_failed) {
			-EFAULT end sqe
		} else {
			do mm_get and mmuse dance
		}
	}

Hence if the sqe doesn't need the mm, doesn't matter if we previously failed. If we need the mm and previously failed, -EFAULT.
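The per-sqe logic in the pseudocode above can be modeled in plain C as follows. This is a rough userspace sketch, not kernel code: `model_submit_state`, `model_prepare_mm()` and the `try_grab_mm` callback are hypothetical names standing in for the real mm handling, and the sketch assumes that a failed grab also fails the sqe that triggered it with -EFAULT.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-submission-pass state (illustration only). */
struct model_submit_state {
	bool have_mm;    /* the mm is already attached to this thread */
	bool mm_fault;   /* an earlier attempt to take the mm failed */
	int grab_calls;  /* number of grab attempts, for demonstration */
};

/*
 * Decide whether one sqe may be issued.
 * Returns 0 if it may be issued, -EFAULT if it must be failed.
 * try_grab_mm stands in for the "mm_get and mmuse dance".
 */
static int model_prepare_mm(struct model_submit_state *st, bool needs_user,
			    bool (*try_grab_mm)(void))
{
	assert(st != NULL && try_grab_mm != NULL);

	/* sqes that don't need the mm are unaffected by earlier failures */
	if (!needs_user || st->have_mm)
		return 0;

	/* never retry once an attempt has failed in this pass */
	if (st->mm_fault)
		return -EFAULT;

	st->grab_calls++;
	if (!try_grab_mm()) {
		st->mm_fault = true;
		return -EFAULT;
	}
	st->have_mm = true;
	return 0;
}

/* Trivial callbacks simulating a successful and a failed grab. */
static bool model_grab_ok(void) { return true; }
static bool model_grab_fail(void) { return false; }
```

In this model the grab is attempted at most once per pass: after a failure, only sqes that need the mm are failed, and after a success, later sqes reuse the attached mm.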
On 27/10/2019 22:02, Jens Axboe wrote:
> Since you've killed the batching, I think the logic should be something
> ala:
>
>	if (io_sqe_needs_user(sqe) && !cur_mm) {
>		if (already_attempted_mmget_and_failed) {
>			-EFAULT end sqe
>		} else {
>			do mm_get and mmuse dance
>		}
>	}
>
> Hence if the sqe doesn't need the mm, doesn't matter if we previously
> failed. If we need the mm and previously failed, -EFAULT.

That makes sense, but it's a bit hard to implement while honoring links and drains.
On 10/27/19 1:17 PM, Pavel Begunkov wrote:
> That makes sense, but a bit hard to implement honoring links and drains

If it becomes too complicated or convoluted, just drop it. It's not worth spending that much time on.
On 27/10/2019 22:51, Jens Axboe wrote:
> If it becomes too complicated or convoluted, just drop it. It's not
> worth spending that much time on.

I've already done it more or less elegantly, I just prefer to test commits before sending.
On 10/27/19 1:59 PM, Pavel Begunkov wrote:
> I've already done it more or less elegantly, just prefer to test commits
> before sending.

That's always appreciated!

It struck me that while I've added quite a few regression tests, we don't have any that just do basic read/write using the variety of settings we have for that. So I added that to liburing.
On 28/10/2019 06:38, Jens Axboe wrote:
> That's always appreciated!
>
> It struck me that while I've added quite a few regression tests, we don't
> have any that just do basic read/write using the variety of settings we
> have for that. So I added that to liburing.

Great, thanks! I think I'll postpone patches including these until the start of 5.5.