Message ID | 20240731214256.3588718-1-andrii@kernel.org (mailing list archive) |
---|---|
Headers | show |
Series | uprobes: RCU-protected hot path optimizations | expand |
On 07/31, Andrii Nakryiko wrote: > > Andrii Nakryiko (6): > uprobes: revamp uprobe refcounting and lifetime management > uprobes: protected uprobe lifetime with SRCU > uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks > uprobes: travers uprobe's consumer list locklessly under SRCU > protection > uprobes: perform lockless SRCU-protected uprobes_tree lookup > uprobes: switch to RCU Tasks Trace flavor for better performance > > Peter Zijlstra (2): > rbtree: provide rb_find_rcu() / rb_find_add_rcu() > perf/uprobe: split uprobe_unregister() I see nothing wrong in 1-7. LGTM. But since you are going to send the new version, I'd like to apply V2 and then try to re-check the resulting code. As for 8/8 - I leave it to you and Peter. I'd prefer SRCU though ;) Oleg.
On Wed, Aug 7, 2024 at 6:30 AM Oleg Nesterov <oleg@redhat.com> wrote: > > On 07/31, Andrii Nakryiko wrote: > > > > Andrii Nakryiko (6): > > uprobes: revamp uprobe refcounting and lifetime management > > uprobes: protected uprobe lifetime with SRCU > > uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks > > uprobes: travers uprobe's consumer list locklessly under SRCU > > protection > > uprobes: perform lockless SRCU-protected uprobes_tree lookup > > uprobes: switch to RCU Tasks Trace flavor for better performance > > > > Peter Zijlstra (2): > > rbtree: provide rb_find_rcu() / rb_find_add_rcu() > > perf/uprobe: split uprobe_unregister() > > I see nothing wrong in 1-7. LGTM. > So, I don't know how it slipped the first time, because I tested overnight with uprobe-stress, perhaps I adjusted some parameters (more threads or different ratio of threads, not sure by now), but it turns out that lockless RB-tree traversal actually crashes after a few minutes of running uprobe-stress. I'll post details separately later today, but I suspect that rb_find_rcu() and rb_find_add_rcu() is not enough to make it safe. Hopefully someone can help me figure this out. > But since you are going to send the new version, I'd like to apply V2 > and then try to re-check the resulting code. Yes, I was waiting for more of Peter's comments, but I guess I'll just send a v2 today. I'll probably include the SRCU+timeout logic for return_instances, and maybe lockless VMA parts as well. Those won't be yet for landing, but it's probably useful to see everything end-to-end. Given the hiccup with lockless uprobes_tree lookup, we should land that change, but the first 4 patches hopefully are in good enough shape to apply and reduce the amount of patches that need to be juggled around. > > As for 8/8 - I leave it to you and Peter. I'd prefer SRCU though ;) Honestly curious, why the preference? > > Oleg. > BTW, while you are here :) What can you say about current->sighand->siglock use in handle_singlestep()? Is there any way to avoid that (or at least not have to take it every single time a single-stepped uprobe is triggered?). This turned out to be a huge issue for single-stepped uprobe when scaling to multiple CPUs even with all the other things (all the SRCU and lockless VMA lookup) taken care of.
On 08/07, Andrii Nakryiko wrote: > > Yes, I was waiting for more of Peter's comments, but I guess I'll just > send a v2 today. OK, > I'll probably include the SRCU+timeout logic for > return_instances, and maybe lockless VMA parts as well. Well, feel free to do what you think right, but perhaps it would be better to push this series first? at least 1-4. As for lockless VMA. To me this needs more discussions. I didn't read your conversation with Peter and Suren carefully, but I too have some concerns. Most probably I am wrong, and until I saw this thread I didn't even know that vm_area_free() uses call_rcu() if CONFIG_PER_VMA_LOCK, but still. > > As for 8/8 - I leave it to you and Peter. I'd prefer SRCU though ;) > > Honestly curious, why the preference? Well, you can safely ignore me, but since you have asked ;) I understand what SRCU does, and years ago I even understood (I hope) the implementation. More or less the same for rcu_tasks. But as for the _trace flavour, I simply fail to understand its semantics. > BTW, while you are here :) What can you say about > current->sighand->siglock use in handle_singlestep()? It should die, and this looks simple. I disagree with the patches from Liao, see the https://lore.kernel.org/all/20240801082407.1618451-1-liaochang1@huawei.com/ thread, but I agree with the intent. IMO, we need a simple "bool restore_sigpending" in uprobe_task, it will make the necessary changes really simple. (To clarify. In fact I think that a new TIF_ or even PF_ flag makes more sense, afaics it can have more users. But I don't think that uprobes can provide enough justification for that right now) Oleg.
On Wed, Aug 7, 2024 at 10:11 AM Oleg Nesterov <oleg@redhat.com> wrote: > > On 08/07, Andrii Nakryiko wrote: > > > > Yes, I was waiting for more of Peter's comments, but I guess I'll just > > send a v2 today. > > OK, > > > I'll probably include the SRCU+timeout logic for > > return_instances, and maybe lockless VMA parts as well. > > Well, feel free to do what you think right, but perhaps it would be > better to push this series first? at least 1-4. Ok, I can send those first 4 patches first and hopefully we can land them soon and move to the next part. I just also wrote up details about that crash in rb_find_rcu(). > > As for lockless VMA. To me this needs more discussions. I didn't read We are still discussing, feel free to join the conversation. > your conversation with Peter and Suren carefully, but I too have some > concerns. Most probably I am wrong, and until I saw this thread I didn't > even know that vm_area_free() uses call_rcu() if CONFIG_PER_VMA_LOCK, > but still. > > > > As for 8/8 - I leave it to you and Peter. I'd prefer SRCU though ;) > > > > Honestly curious, why the preference? > > Well, you can safely ignore me, but since you have asked ;) > > I understand what SRCU does, and years ago I even understood (I hope) > the implementation. More or less the same for rcu_tasks. But as for > the _trace flavour, I simply fail to understand its semantics. Ok, I won't try to repeat Paul's explanations. If you are curious you can find them in comments to my previous batch register/unregister API patches. > > > BTW, while you are here :) What can you say about > > current->sighand->siglock use in handle_singlestep()? > > It should die, and this looks simple. I disagree with the patches > from Liao, see the > https://lore.kernel.org/all/20240801082407.1618451-1-liaochang1@huawei.com/ > thread, but I agree with the intent. I wasn't aware of this patch, thanks for mentioning it. Strange that me or at least bpf@vger.kernel.org wasn't CC'ed. Liao, please cc bpf@ mailing list for future patches like that. > > IMO, we need a simple "bool restore_sigpending" in uprobe_task, it will make the > necessary changes really simple. The simpler the better, I can't comment on correctness as I don't understand the logic well enough. Are you going to send a patch with your bool flag proposal? > > (To clarify. In fact I think that a new TIF_ or even PF_ flag makes more sense, > afaics it can have more users. But I don't think that uprobes can provide enough > justification for that right now) > > Oleg. >
On 08/07, Andrii Nakryiko wrote: > > Are you going to send a patch with > your bool flag proposal? No, I will wait for the next version from Liao. If he agrees with my comments. If not, we will continue the discussion in that thread. Note also another patch from Liao which removes ->siglock from uprobe_deny_signal(), https://lore.kernel.org/all/20240731095017.1560516-1-liaochang1@huawei.com/ Oleg.
在 2024/8/8 1:31, Andrii Nakryiko 写道: > On Wed, Aug 7, 2024 at 10:11 AM Oleg Nesterov <oleg@redhat.com> wrote: >> >> On 08/07, Andrii Nakryiko wrote: >>> >>> Yes, I was waiting for more of Peter's comments, but I guess I'll just >>> send a v2 today. >> >> OK, >> >>> I'll probably include the SRCU+timeout logic for >>> return_instances, and maybe lockless VMA parts as well. >> >> Well, feel free to do what you think right, but perhaps it would be >> better to push this series first? at least 1-4. > > Ok, I can send those first 4 patches first and hopefully we can land > them soon and move to the next part. I just also wrote up details > about that crash in rb_find_rcu(). > >> >> As for lockless VMA. To me this needs more discussions. I didn't read > > We are still discussing, feel free to join the conversation. > >> your conversation with Peter and Suren carefully, but I too have some >> concerns. Most probably I am wrong, and until I saw this thread I didn't >> even know that vm_area_free() uses call_rcu() if CONFIG_PER_VMA_LOCK, >> but still. >> >>>> As for 8/8 - I leave it to you and Peter. I'd prefer SRCU though ;) >>> >>> Honestly curious, why the preference? >> >> Well, you can safely ignore me, but since you have asked ;) >> >> I understand what SRCU does, and years ago I even understood (I hope) >> the implementation. More or less the same for rcu_tasks. But as for >> the _trace flavour, I simply fail to understand its semantics. > > Ok, I won't try to repeat Paul's explanations. If you are curious you > can find them in comments to my previous batch register/unregister API > patches. > >> >>> BTW, while you are here :) What can you say about >>> current->sighand->siglock use in handle_singlestep()? >> >> It should die, and this looks simple. I disagree with the patches >> from Liao, see the >> https://lore.kernel.org/all/20240801082407.1618451-1-liaochang1@huawei.com/ >> thread, but I agree with the intent. > > I wasn't aware of this patch, thanks for mentioning it. Strange that > me or at least bpf@vger.kernel.org wasn't CC'ed. > > Liao, please cc bpf@ mailing list for future patches like that. OK, sorry about that. > >> >> IMO, we need a simple "bool restore_sigpending" in uprobe_task, it will make the >> necessary changes really simple. > [...] >> >> (To clarify. In fact I think that a new TIF_ or even PF_ flag makes more sense, >> afaics it can have more users. But I don't think that uprobes can provide enough >> justification for that right now) I also face the same choice when Oleg suggested me to add new flag to track the denied flag, due to I haven't encountered scenarios outside of uprobe that would deny signal, so I'm not confident of introduce new TIF_ flag without a fully understanding of potential potential impacts. >> >> Oleg. >>