Message ID | 20250226180727.2499531-2-matttbe@kernel.org (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | dd8797c750fe851f953242b2ddf0e538a761854a |
Delegated to: | Matthieu Baerts |
Series | [mptcp-next] Squash to "bpf: Add mptcp_subflow bpf_iter" |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 9 lines checked |
matttbe/shellcheck | success | MPTCP selftests files have not been modified |
matttbe/KVM_Validation__normal | success | Success! ✅ |
matttbe/KVM_Validation__debug | success | Success! ✅ |
matttbe/KVM_Validation__btf-normal__only_bpftest_all_ | success | Success! ✅ |
matttbe/KVM_Validation__btf-debug__only_bpftest_all_ | success | Success! ✅ |
Hi Matthieu,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/13550710113

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/c610b4dde837
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=938175

If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker

Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Hi Matt,

On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
> From what we understood, when being used from a CG [gs]etsockopt hook,
> the socket lock will be held. It seems that the BPF infra will make sure
> in this case. Mat will try to get the confirmation.
>
> The idea is then to add this msk_owned_by_me() check to make sure the
> assumption is correct, and will continue to be: the new selftest should
> complain if not when used with a debug kconfig. In other words, if the
> tests continue to pass with this patch, the Squash-to series can
> probably be applied.
>
> Note that this is just a debug check, hoping that the selftest will
> cover all cases. We can then not use this check to return an error if it
> is not held.
>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

Thanks for updating this for me.

LGTM!

Reviewed-by: Geliang Tang <geliang@kernel.org>

> ---
> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>

I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
support" v4, together with this squash-to patch, as "Queued".

Thanks,
-Geliang

> Cc: Geliang Tang <geliang@kernel.org>
> Cc: Mat Martineau <martineau@kernel.org>
> ---
>  net/mptcp/bpf.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> index ac7744c6006f..9e8c24c88022 100644
> --- a/net/mptcp/bpf.c
> +++ b/net/mptcp/bpf.c
> @@ -252,6 +252,9 @@ bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
>  		return -EINVAL;
>
>  	msk = mptcp_sk(sk);
> +
> +	msk_owned_by_me(msk);
> +
>  	kit->msk = msk;
>  	kit->pos = &msk->conn_list;
>  	return 0;
Hi Geliang, Mat,

On 27/02/2025 03:03, Geliang Tang wrote:
> Hi Matt,
>
> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>> From what we understood, when being used from a CG [gs]etsockopt hook,
>> the socket lock will be held. It seems that the BPF infra will make sure
>> in this case. Mat will try to get the confirmation.
>>
>> The idea is then to add this msk_owned_by_me() check to make sure the
>> assumption is correct, and will continue to be: the new selftest should
>> complain if not when used with a debug kconfig. In other words, if the
>> tests continue to pass with this patch, the Squash-to series can
>> probably be applied.
>>
>> Note that this is just a debug check, hoping that the selftest will
>> cover all cases. We can then not use this check to return an error if it
>> is not held.
>>
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>
> Thanks for updating this for me.
>
> LGTM!
>
> Reviewed-by: Geliang Tang <geliang@kernel.org>

Thank you for the review!

>> ---
>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>
> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
> support" v4, together with this squash-to patch, as "Queued".

Thank you. If that's OK, I will wait for Mat's green light before
applying the patches, because he wanted to look at the BPF code to make
sure our assumption was correct.

Cheers,
Matt
On Thu, 27 Feb 2025, Matthieu Baerts wrote:

> Hi Geliang, Mat,
>
> On 27/02/2025 03:03, Geliang Tang wrote:
>> Hi Matt,
>>
>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>> From what we understood, when being used from a CG [gs]etsockopt hook,
>>> the socket lock will be held. It seems that the BPF infra will make sure
>>> in this case. Mat will try to get the confirmation.
>>>
>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>> assumption is correct, and will continue to be: the new selftest should
>>> complain if not when used with a debug kconfig. In other words, if the
>>> tests continue to pass with this patch, the Squash-to series can
>>> probably be applied.
>>>
>>> Note that this is just a debug check, hoping that the selftest will
>>> cover all cases. We can then not use this check to return an error if it
>>> is not held.
>>>
>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>
>> Thanks for updating this for me.
>>
>> LGTM!
>>
>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>
> Thank you for the review!
>
>>> ---
>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>
>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>> support" v4, together with this squash-to patch, as "Queued".
>
> Thank you. If that's OK, I will wait for Mat's green light before
> applying the patches, because he wanted to look at the BPF code to make
> sure our assumption was correct.
>

Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
socket lock is acquired by the bpf wrapper code:

https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955

That's why the test passes. There's still the general issue that the
iterator could be used in any bpf code that can pass a valid msk
pointer. The existing lock checks in the export branch check that the
socket is locked, but it could be locked by some other owner.

As we've already discussed, msk_owned_by_me() doesn't work everywhere
because we can't change behavior if the lock isn't owned. However, we
don't need to handle every locking case like lockdep does - for the
scheduler BPF case, we only care if the BPF code is running in the
scheduler context. So, I think we can do a simplified version of
tracking the lock context:

1. Add a 'struct task_struct *bpf_scheduler_task' field to struct
   mptcp_sock.

2. Do a WRITE_ONCE(msk->bpf_scheduler_task, current) before calling a
   packet scheduler hook, and WRITE_ONCE(msk->bpf_scheduler_task, NULL)
   after the hook returns.

3. In bpf_iter_mptcp_subflow_new(), check
   "READ_ONCE(msk->bpf_scheduler_task) == current" to confirm the correct
   task, return -EINVAL if it doesn't match.

(It would help to create helper functions for setting and checking that
value)

Do you think this iterator will be useful in other places, like bpf path
manager? The same approach would work there, but a better name than
"bpf_scheduler_task" would help.

- Mat
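
Editor's sketch: the three steps proposed above can be modelled in plain userspace C. This is only an illustration under stated assumptions, not kernel code and not the patch that was applied. `struct task`, `struct msk_model`, `current_task`, `msk_bpf_ctx_enter()`, `msk_bpf_ctx_exit()`, `run_scheduler_hook()` and `iter_new_check()` are hypothetical names, and C11 relaxed atomics stand in for the kernel's WRITE_ONCE()/READ_ONCE().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct. */
struct task {
	int id;
};

/* Hypothetical stand-in for struct mptcp_sock. */
struct msk_model {
	/* Step 1: task currently running a scheduler hook, or NULL. */
	_Atomic(struct task *) bpf_scheduler_task;
};

/* Stand-in for the kernel's "current" task pointer. */
static struct task *current_task;

/* Step 2 (helper): record which task is entering the hook. */
static void msk_bpf_ctx_enter(struct msk_model *msk)
{
	atomic_store_explicit(&msk->bpf_scheduler_task, current_task,
			      memory_order_relaxed);
}

/* Step 2 (helper): clear the marker once the hook has returned. */
static void msk_bpf_ctx_exit(struct msk_model *msk)
{
	atomic_store_explicit(&msk->bpf_scheduler_task, (struct task *)NULL,
			      memory_order_relaxed);
}

/* Step 3: what the iterator constructor would verify before proceeding. */
static int iter_new_check(struct msk_model *msk)
{
	struct task *owner = atomic_load_explicit(&msk->bpf_scheduler_task,
						  memory_order_relaxed);

	return owner == current_task ? 0 : -EINVAL;
}

/* Wrap a hook call so the marker is only set while the hook runs. */
static int run_scheduler_hook(struct msk_model *msk,
			      int (*hook)(struct msk_model *))
{
	int ret;

	msk_bpf_ctx_enter(msk);
	ret = hook(msk);
	msk_bpf_ctx_exit(msk);
	return ret;
}
```

In this model, calling `iter_new_check()` directly fails with -EINVAL, while calling it through `run_scheduler_hook()` succeeds. The check also fails when a different task is "current" than the one that entered the hook, which is the case a plain "is the socket locked?" test cannot distinguish.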
Hi Mat,

On 01/03/2025 02:37, Mat Martineau wrote:
> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>
>> Hi Geliang, Mat,
>>
>> On 27/02/2025 03:03, Geliang Tang wrote:
>>> Hi Matt,
>>>
>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>> From what we understood, when being used from a CG [gs]etsockopt hook,
>>>> the socket lock will be held. It seems that the BPF infra will make sure
>>>> in this case. Mat will try to get the confirmation.
>>>>
>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>> assumption is correct, and will continue to be: the new selftest should
>>>> complain if not when used with a debug kconfig. In other words, if the
>>>> tests continue to pass with this patch, the Squash-to series can
>>>> probably be applied.
>>>>
>>>> Note that this is just a debug check, hoping that the selftest will
>>>> cover all cases. We can then not use this check to return an error if it
>>>> is not held.
>>>>
>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>
>>> Thanks for updating this for me.
>>>
>>> LGTM!
>>>
>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>
>> Thank you for the review!
>>
>>>> ---
>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>
>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>> support" v4, together with this squash-to patch, as "Queued".
>>
>> Thank you. If that's OK, I will wait for Mat's green light before
>> applying the patches, because he wanted to look at the BPF code to make
>> sure our assumption was correct.
>>
>
> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
> socket lock is acquired by the bpf wrapper code:
>
> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955
>
> That's why the test passes. There's still the general issue that the
> iterator could be used in any bpf code that can pass a valid msk
> pointer. The existing lock checks in the export branch check that the
> socket is locked, but it could be locked by some other owner.

Thank you for having looked! Because these new kfuncs are currently only
tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
patches and send this to BPF maintainers?

When bpf_iter will be used in the schedulers and the PM, more checks can
be added later.

> As we've already discussed, msk_owned_by_me() doesn't work everywhere
> because we can't change behavior if the lock isn't owned. However, we
> don't need to handle every locking case like lockdep does - for the
> scheduler BPF case, we only care if the BPF code is running in the
> scheduler context. So, I think we can do a simplified version of
> tracking the lock context:
>
> 1. Add a 'struct task_struct *bpf_scheduler_task' field to struct
>    mptcp_sock.
>
> 2. Do a WRITE_ONCE(msk->bpf_scheduler_task, current) before calling a
>    packet scheduler hook, and WRITE_ONCE(msk->bpf_scheduler_task, NULL)
>    after the hook returns.
>
> 3. In bpf_iter_mptcp_subflow_new(), check
>    "READ_ONCE(msk->bpf_scheduler_task) == current" to confirm the correct
>    task, return -EINVAL if it doesn't match.

That's a very good idea! Yes, we will need to add such checks before
msk_owned_by_me() when the bpf_iter will be used elsewhere.

> (It would help to create helper functions for setting and checking that
> value)

+1.

> Do you think this iterator will be useful in other places, like bpf path
> manager? The same approach would work there, but a better name than
> "bpf_scheduler_task" would help.

Yes, good point! "msk->bpf_struct_ops_task"?

(that's something that could also be added after having applied the "use
bpf_iter in bpf schedulers" series)

Cheers,
Matt
On Sat, 1 Mar 2025, Matthieu Baerts wrote:

> Hi Mat,
>
> On 01/03/2025 02:37, Mat Martineau wrote:
>> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>>
>>> Hi Geliang, Mat,
>>>
>>> On 27/02/2025 03:03, Geliang Tang wrote:
>>>> Hi Matt,
>>>>
>>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>>> From what we understood, when being used from a CG [gs]etsockopt hook,
>>>>> the socket lock will be held. It seems that the BPF infra will make sure
>>>>> in this case. Mat will try to get the confirmation.
>>>>>
>>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>>> assumption is correct, and will continue to be: the new selftest should
>>>>> complain if not when used with a debug kconfig. In other words, if the
>>>>> tests continue to pass with this patch, the Squash-to series can
>>>>> probably be applied.
>>>>>
>>>>> Note that this is just a debug check, hoping that the selftest will
>>>>> cover all cases. We can then not use this check to return an error if it
>>>>> is not held.
>>>>>
>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>
>>>> Thanks for updating this for me.
>>>>
>>>> LGTM!
>>>>
>>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>>
>>> Thank you for the review!
>>>
>>>>> ---
>>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>>
>>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>>> support" v4, together with this squash-to patch, as "Queued".
>>>
>>> Thank you. If that's OK, I will wait for Mat's green light before
>>> applying the patches, because he wanted to look at the BPF code to make
>>> sure our assumption was correct.
>>>
>>
>> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
>> socket lock is acquired by the bpf wrapper code:
>>
>> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955
>>
>> That's why the test passes. There's still the general issue that the
>> iterator could be used in any bpf code that can pass a valid msk
>> pointer. The existing lock checks in the export branch check that the
>> socket is locked, but it could be locked by some other owner.
>
> Thank you for having looked! Because these new kfuncs are currently only
> tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
> patches and send this to BPF maintainers?

Hi Matthieu -

I think once the squash-to patches are accepted into our repo (a few more
changes are needed), it would be ok to upstream them to the BPF
maintainers.

- Mat

>
>
> When bpf_iter will be used in the schedulers and the PM, more checks can
> be added later.
>
>> As we've already discussed, msk_owned_by_me() doesn't work everywhere
>> because we can't change behavior if the lock isn't owned. However, we
>> don't need to handle every locking case like lockdep does - for the
>> scheduler BPF case, we only care if the BPF code is running in the
>> scheduler context. So, I think we can do a simplified version of
>> tracking the lock context:
>>
>> 1. Add a 'struct task_struct *bpf_scheduler_task' field to struct
>>    mptcp_sock.
>>
>> 2. Do a WRITE_ONCE(msk->bpf_scheduler_task, current) before calling a
>>    packet scheduler hook, and WRITE_ONCE(msk->bpf_scheduler_task, NULL)
>>    after the hook returns.
>>
>> 3. In bpf_iter_mptcp_subflow_new(), check
>>    "READ_ONCE(msk->bpf_scheduler_task) == current" to confirm the correct
>>    task, return -EINVAL if it doesn't match.
>
> That's a very good idea! Yes, we will need to add such checks before
> msk_owned_by_me() when the bpf_iter will be used elsewhere.
>
>> (It would help to create helper functions for setting and checking that
>> value)
>
> +1.
>
>> Do you think this iterator will be useful in other places, like bpf path
>> manager? The same approach would work there, but a better name than
>> "bpf_scheduler_task" would help.
> Yes, good point! "msk->bpf_struct_ops_task"?
>
> (that's something that could also be added after having applied the "use
> bpf_iter in bpf schedulers" series)
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>
>
Hi Mat,

On 04/03/2025 02:42, Mat Martineau wrote:
> On Sat, 1 Mar 2025, Matthieu Baerts wrote:
>
>> Hi Mat,
>>
>> On 01/03/2025 02:37, Mat Martineau wrote:
>>> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>>>
>>>> Hi Geliang, Mat,
>>>>
>>>> On 27/02/2025 03:03, Geliang Tang wrote:
>>>>> Hi Matt,
>>>>>
>>>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>>>> From what we understood, when being used from a CG [gs]etsockopt hook,
>>>>>> the socket lock will be held. It seems that the BPF infra will make sure
>>>>>> in this case. Mat will try to get the confirmation.
>>>>>>
>>>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>>>> assumption is correct, and will continue to be: the new selftest should
>>>>>> complain if not when used with a debug kconfig. In other words, if the
>>>>>> tests continue to pass with this patch, the Squash-to series can
>>>>>> probably be applied.
>>>>>>
>>>>>> Note that this is just a debug check, hoping that the selftest will
>>>>>> cover all cases. We can then not use this check to return an error if it
>>>>>> is not held.
>>>>>>
>>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>>
>>>>> Thanks for updating this for me.
>>>>>
>>>>> LGTM!
>>>>>
>>>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>>>
>>>> Thank you for the review!
>>>>
>>>>>> ---
>>>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>>>
>>>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>>>> support" v4, together with this squash-to patch, as "Queued".
>>>>
>>>> Thank you. If that's OK, I will wait for Mat's green light before
>>>> applying the patches, because he wanted to look at the BPF code to make
>>>> sure our assumption was correct.
>>>>
>>>
>>> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
>>> socket lock is acquired by the bpf wrapper code:
>>>
>>> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955
>>>
>>> That's why the test passes. There's still the general issue that the
>>> iterator could be used in any bpf code that can pass a valid msk
>>> pointer. The existing lock checks in the export branch check that the
>>> socket is locked, but it could be locked by some other owner.
>>
>> Thank you for having looked! Because these new kfuncs are currently only
>> tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
>> patches and send this to BPF maintainers?
>
> Hi Matthieu -
>
> I think once the squash-to patches are accepted into our repo (a few
> more changes are needed), it would be ok to upstream them to the BPF
> maintainers.

Thank you for your reply! But I admit I'm a bit lost: which changes are
you talking about?

In the "Add mptcp_subflow bpf_iter support" series we sent to the BPF
maintainers, this support is only added to CG [gs]etsockopt where the sk
lock is owned, so no need to have extra checks for the moment, no? Then
the "Squash to "Add mptcp_subflow bpf_iter support"" v4 (not the last
one you looked at, the v5) should be enough, no?

My understanding is that setting "msk->bpf_XXX" to "current" will only
be needed when mptcp_subflow bpf_iter will be allowed to be used with
struct_ops, no? If yes, better to keep this complexity for later, no?

Cheers,
Matt
On Tue, 4 Mar 2025, Matthieu Baerts wrote:

> Hi Mat,
>
> On 04/03/2025 02:42, Mat Martineau wrote:
>> On Sat, 1 Mar 2025, Matthieu Baerts wrote:
>>
>>> Hi Mat,
>>>
>>> On 01/03/2025 02:37, Mat Martineau wrote:
>>>> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>>>>
>>>>> Hi Geliang, Mat,
>>>>>
>>>>> On 27/02/2025 03:03, Geliang Tang wrote:
>>>>>> Hi Matt,
>>>>>>
>>>>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>>>>> From what we understood, when being used from a CG [gs]etsockopt hook,
>>>>>>> the socket lock will be held. It seems that the BPF infra will make sure
>>>>>>> in this case. Mat will try to get the confirmation.
>>>>>>>
>>>>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>>>>> assumption is correct, and will continue to be: the new selftest should
>>>>>>> complain if not when used with a debug kconfig. In other words, if the
>>>>>>> tests continue to pass with this patch, the Squash-to series can
>>>>>>> probably be applied.
>>>>>>>
>>>>>>> Note that this is just a debug check, hoping that the selftest will
>>>>>>> cover all cases. We can then not use this check to return an error if it
>>>>>>> is not held.
>>>>>>>
>>>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>>>
>>>>>> Thanks for updating this for me.
>>>>>>
>>>>>> LGTM!
>>>>>>
>>>>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>>>>
>>>>> Thank you for the review!
>>>>>
>>>>>>> ---
>>>>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>>>>
>>>>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>>>>> support" v4, together with this squash-to patch, as "Queued".
>>>>>
>>>>> Thank you. If that's OK, I will wait for Mat's green light before
>>>>> applying the patches, because he wanted to look at the BPF code to make
>>>>> sure our assumption was correct.
>>>>>
>>>>
>>>> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
>>>> socket lock is acquired by the bpf wrapper code:
>>>>
>>>> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955
>>>>
>>>> That's why the test passes. There's still the general issue that the
>>>> iterator could be used in any bpf code that can pass a valid msk
>>>> pointer. The existing lock checks in the export branch check that the
>>>> socket is locked, but it could be locked by some other owner.
>>>
>>> Thank you for having looked! Because these new kfuncs are currently only
>>> tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
>>> patches and send this to BPF maintainers?
>>
>> Hi Matthieu -
>>
>> I think once the squash-to patches are accepted into our repo (a few
>> more changes are needed), it would be ok to upstream them to the BPF
>> maintainers.
>
> Thank you for your reply! But I admit I'm a bit lost: which changes are
> you talking about?
>
> In the "Add mptcp_subflow bpf_iter support" series we sent to the BPF
> maintainers, this support is only added to CG [gs]etsockopt where the sk
> lock is owned, so no need to have extra checks for the moment, no? Then
> the "Squash to "Add mptcp_subflow bpf_iter support"" v4 (not the last
> one you looked at, the v5) should be enough, no?
>

Hi Matthieu -

Thanks for clarifying, I misunderstood the scope of "all these squash-to
patches" to include v5 :)

I see what you mean, and I do agree that squashing the v4 patches
(https://patchwork.kernel.org/project/mptcp/cover/cover.1740368110.git.tanggeliang@kylinos.cn/)
for another round of BPF maintainer review is ok with me since the sockopt
calls have the necessary locking, so we can separately upstream the
iterator series (with sockopt limitations) now.

> My understanding is that setting "msk->bpf_XXX" to "current" will only
> be needed when mptcp_subflow bpf_iter will be allowed to be used with
> struct_ops, no? If yes, better to keep this complexity for later, no?

Agree here as well, the task_struct checking is only needed once there are
other locking situations to verify.

- Mat
Hi Mat,

On 05/03/2025 02:35, Mat Martineau wrote:
> On Tue, 4 Mar 2025, Matthieu Baerts wrote:
>
>> Hi Mat,
>>
>> On 04/03/2025 02:42, Mat Martineau wrote:
>>> On Sat, 1 Mar 2025, Matthieu Baerts wrote:
>>>
>>>> Hi Mat,
>>>>
>>>> On 01/03/2025 02:37, Mat Martineau wrote:
>>>>> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>>>>>
>>>>>> Hi Geliang, Mat,
>>>>>>
>>>>>> On 27/02/2025 03:03, Geliang Tang wrote:
>>>>>>> Hi Matt,
>>>>>>>
>>>>>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>>>>>> From what we understood, when being used from a CG [gs]etsockopt hook,
>>>>>>>> the socket lock will be held. It seems that the BPF infra will make sure
>>>>>>>> in this case. Mat will try to get the confirmation.
>>>>>>>>
>>>>>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>>>>>> assumption is correct, and will continue to be: the new selftest should
>>>>>>>> complain if not when used with a debug kconfig. In other words, if the
>>>>>>>> tests continue to pass with this patch, the Squash-to series can
>>>>>>>> probably be applied.
>>>>>>>>
>>>>>>>> Note that this is just a debug check, hoping that the selftest will
>>>>>>>> cover all cases. We can then not use this check to return an error if it
>>>>>>>> is not held.
>>>>>>>>
>>>>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>>>>
>>>>>>> Thanks for updating this for me.
>>>>>>>
>>>>>>> LGTM!
>>>>>>>
>>>>>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>>>>>
>>>>>> Thank you for the review!
>>>>>>
>>>>>>>> ---
>>>>>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>>>>>
>>>>>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>>>>>> support" v4, together with this squash-to patch, as "Queued".
>>>>>>
>>>>>> Thank you. If that's OK, I will wait for Mat's green light before
>>>>>> applying the patches, because he wanted to look at the BPF code to make
>>>>>> sure our assumption was correct.
>>>>>>
>>>>>
>>>>> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
>>>>> socket lock is acquired by the bpf wrapper code:
>>>>>
>>>>> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955
>>>>>
>>>>> That's why the test passes. There's still the general issue that the
>>>>> iterator could be used in any bpf code that can pass a valid msk
>>>>> pointer. The existing lock checks in the export branch check that the
>>>>> socket is locked, but it could be locked by some other owner.
>>>>
>>>> Thank you for having looked! Because these new kfuncs are currently only
>>>> tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
>>>> patches and send this to BPF maintainers?
>>>
>>> Hi Matthieu -
>>>
>>> I think once the squash-to patches are accepted into our repo (a few
>>> more changes are needed), it would be ok to upstream them to the BPF
>>> maintainers.
>>
>> Thank you for your reply! But I admit I'm a bit lost: which changes are
>> you talking about?
>>
>> In the "Add mptcp_subflow bpf_iter support" series we sent to the BPF
>> maintainers, this support is only added to CG [gs]etsockopt where the sk
>> lock is owned, so no need to have extra checks for the moment, no? Then
>> the "Squash to "Add mptcp_subflow bpf_iter support"" v4 (not the last
>> one you looked at, the v5) should be enough, no?
>>
>
> Hi Matthieu -
>
> Thanks for clarifying, I misunderstood the scope of "all these squash-to
> patches" to include v5 :)
>
> I see what you mean, and I do agree that squashing the v4 patches
> (https://patchwork.kernel.org/project/mptcp/cover/cover.1740368110.git.tanggeliang@kylinos.cn/)
> for another round of BPF maintainer review is ok with me since the
> sockopt calls have the necessary locking, so we can separately upstream
> the iterator series (with sockopt limitations) now.

OK, great! I will then apply the v4, and archive the v5 on Patchwork to
avoid confusion.

I will then also apply "use bpf_iter in bpf schedulers" v15, even if the
"task_struct" checking will still need to be added, but at least we have
some material to show to Martin.

>> My understanding is that setting "msk->bpf_XXX" to "current" will only
>> be needed when mptcp_subflow bpf_iter will be allowed to be used with
>> struct_ops, no? If yes, better to keep this complexity for later, no?
>
> Agree here as well, the task_struct checking is only needed once there
> are other locking situations to verify.

Good! I guess Geliang will look at that later.

Cheers,
Matt
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index ac7744c6006f..9e8c24c88022 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -252,6 +252,9 @@ bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
 		return -EINVAL;
 
 	msk = mptcp_sk(sk);
+
+	msk_owned_by_me(msk);
+
 	kit->msk = msk;
 	kit->pos = &msk->conn_list;
 	return 0;
From what we understood, when being used from a CG [gs]etsockopt hook,
the socket lock will be held. It seems that the BPF infra will make sure
in this case. Mat will try to get the confirmation.

The idea is then to add this msk_owned_by_me() check to make sure the
assumption is correct, and will continue to be: the new selftest should
complain if not when used with a debug kconfig. In other words, if the
tests continue to pass with this patch, the Squash-to series can
probably be applied.

Note that this is just a debug check, hoping that the selftest will
cover all cases. We can then not use this check to return an error if it
is not held.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
Cc: Geliang Tang <geliang@kernel.org>
Cc: Mat Martineau <martineau@kernel.org>
---
 net/mptcp/bpf.c | 3 +++
 1 file changed, 3 insertions(+)
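
Editor's sketch: the commit message describes msk_owned_by_me() as "just a debug check" that cannot be used to return an error. The userspace model below illustrates that property under stated assumptions. It assumes the kernel check is a warn-once assertion that is compiled out without lock debugging. `struct sock_model`, `MODEL_LOCK_DEBUG`, `model_owned_by_me()`, `model_iter_new()` and `model_warnings` are hypothetical names and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Set to 0 to model a build without lock debugging (assumption). */
#ifndef MODEL_LOCK_DEBUG
#define MODEL_LOCK_DEBUG 1
#endif

/* Hypothetical stand-in for a socket: only tracks lock ownership. */
struct sock_model {
	bool lock_held_by_caller;
};

/* Counts emitted warnings so the behaviour can be observed. */
static int model_warnings;

/* Debug-only assertion: warns at most once and returns nothing. */
static void model_owned_by_me(const struct sock_model *sk)
{
#if MODEL_LOCK_DEBUG
	static bool warned_once;

	if (!sk->lock_held_by_caller && !warned_once) {
		warned_once = true;
		model_warnings++;
		fprintf(stderr, "WARNING: socket lock not held by caller\n");
	}
#else
	(void)sk; /* compiled out: no check at all */
#endif
}

/* Model of the iterator constructor: the check cannot change the result. */
static int model_iter_new(const struct sock_model *sk)
{
	model_owned_by_me(sk);
	return 0;
}
```

Because the check returns void and may be compiled out, `model_iter_new()` returns 0 whether or not the lock is held. A debug build only leaves a warning for the selftests to notice, which is why the thread discusses a separate task-tracking check for contexts where a real error must be returned.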