Message ID | c43f5e1fe8da4db01749ede9d18af1a3a2134fd4.1740997925.git.tanggeliang@kylinos.cn (mailing list archive) |
---|---|
State | Rejected, archived |
Delegated to: | Matthieu Baerts |
Series | Squash to "Add mptcp_subflow bpf_iter support" |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 26 lines checked |
matttbe/shellcheck | success | MPTCP selftests files have not been modified |
matttbe/KVM_Validation__normal | success | Success! ✅ |
matttbe/KVM_Validation__debug | success | Success! ✅ |
matttbe/KVM_Validation__btf-normal__only_bpftest_all_ | success | Success! ✅ |
matttbe/KVM_Validation__btf-debug__only_bpftest_all_ | success | Success! ✅ |
On Mon, 3 Mar 2025, Geliang Tang wrote:

> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> Set msk->bpf_iter_task in bpf_mptcp_sock_from_sock() to allow
> mptcp_subflow bpt_iter can be used in cgroup/getsockopt,
> otherwise, the selftest in this set fails.
>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> net/mptcp/bpf.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> index be222fa5f308..aff92f458e7f 100644
> --- a/net/mptcp/bpf.c
> +++ b/net/mptcp/bpf.c
> @@ -197,14 +197,22 @@ static struct bpf_struct_ops bpf_mptcp_sched_ops = {
>
>  struct mptcp_sock *bpf_mptcp_sock_from_sock(struct sock *sk)
>  {
> +	struct mptcp_sock *msk;
> +
>  	if (unlikely(!sk || !sk_fullsock(sk)))
>  		return NULL;
>
> -	if (sk->sk_protocol == IPPROTO_MPTCP)
> -		return mptcp_sk(sk);
> +	if (sk->sk_protocol == IPPROTO_MPTCP) {
> +		msk = mptcp_sk(sk);
> +		mptcp_set_bpf_iter_task(msk);

Hi Geliang -

The important part of the bpf_iter_task approach is to only set the
task_struct pointer when the msk lock is already held, and to clear it
before the msk lock is released. When used this way, the check ensures
that the iterator code is both *running with the socket locked* and *in
the same context where the lock is held*.

The code above will set msk->bpf_iter_task even when the lock is not held,
and then it is never cleared.

To set/clear as expected for get/setsockopt I think the task pointer would
have to be set in places where the locks are acquired and released, like
these:

https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1955
https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1846

- Mat

> +		return msk;
> +	}
>
> -	if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
> -		return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> +	if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
> +		msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> +		mptcp_set_bpf_iter_task(msk);
> +		return msk;
> +	}
>
>  	return NULL;
>  }
> --
> 2.43.0
>
>
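A minimal sketch of the set/clear discipline described above, assuming a
hypothetical mptcp_clear_bpf_iter_task() helper and a stand-in function for
the cgroup get/setsockopt call sites linked above (an illustration, not the
actual kernel/bpf/cgroup.c code):

/* Sketch only: set msk->bpf_iter_task while the msk lock is held, and
 * clear it again before the lock is released, so the bpf_iter check can
 * confirm it runs in the lock-holding context.
 */
#include <net/sock.h>
#include "protocol.h"	/* struct mptcp_sock, mptcp_sk() (net/mptcp/) */

static int example_cgroup_getsockopt_path(struct sock *sk)
{
	struct mptcp_sock *msk = mptcp_sk(sk);

	lock_sock(sk);			/* msk lock acquired */
	mptcp_set_bpf_iter_task(msk);	/* record the task holding the lock */

	/* ... run the BPF cgroup/getsockopt program here; it may now use
	 * the mptcp_subflow bpf_iter ...
	 */

	mptcp_clear_bpf_iter_task(msk);	/* hypothetical clear helper */
	release_sock(sk);		/* released only after the clear */

	return 0;
}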
Hi Mat,

Thanks for the review.

On Mon, 2025-03-03 at 17:37 -0800, Mat Martineau wrote:
> On Mon, 3 Mar 2025, Geliang Tang wrote:
>
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> >
> > Set msk->bpf_iter_task in bpf_mptcp_sock_from_sock() to allow
> > mptcp_subflow bpt_iter can be used in cgroup/getsockopt,
> > otherwise, the selftest in this set fails.
> >
> > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > ---
> > net/mptcp/bpf.c | 16 ++++++++++++----
> > 1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> > index be222fa5f308..aff92f458e7f 100644
> > --- a/net/mptcp/bpf.c
> > +++ b/net/mptcp/bpf.c
> > @@ -197,14 +197,22 @@ static struct bpf_struct_ops
> > bpf_mptcp_sched_ops = {
> >
> > struct mptcp_sock *bpf_mptcp_sock_from_sock(struct sock *sk)
> > {
> > +	struct mptcp_sock *msk;
> > +
> > 	if (unlikely(!sk || !sk_fullsock(sk)))
> > 		return NULL;
> >
> > -	if (sk->sk_protocol == IPPROTO_MPTCP)
> > -		return mptcp_sk(sk);
> > +	if (sk->sk_protocol == IPPROTO_MPTCP) {
> > +		msk = mptcp_sk(sk);
> > +		mptcp_set_bpf_iter_task(msk);
>
> Hi Geliang -
>
> The important part of the bpf_iter_task approach is to only set the
> task_struct pointer when the msk lock is already held, and to clear
> it
> before the msk lock is released. When used this way, the check
> ensures
> that the iterator code is both *running with the socket locked* and
> *in
> the same context where the lock is held*.

Got it, thanks for your explanation.

>
> The code above will set msk->bpf_iter_task even when the lock is not
> held,
> and then it is never cleared.
>
> To set/clear as expected for get/setsockopt I think the task pointer
> would
> have to be set in places where the locks are acquired and released,
> like
> these:
>
> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1955
> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1846
>

The get/setsockopt scenario is a bit complicated. I think we can use
the bpf_iter_task check only for bpf schedulers, keep the
get/setsockopt scenario unchanged, and continue to use the
msk_owned_by_me(msk) check. What do you think?

In addition, I think it's better to add bpf_iter_task to struct
mptcp_sched_data instead in struct mptcp_sock, since it's more related
to mptcp scheduler. For specific usage, please see [1]. What do you
think of this?

Thanks,
-Geliang

[1]
https://patchwork.kernel.org/project/mptcp/cover/cover.1739788598.git.tanggeliang@kylinos.cn/

>
> - Mat
>
> > +		return msk;
> > +	}
> >
> > -	if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
> > -		return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> > +	if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
> > +		msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
> > +		mptcp_set_bpf_iter_task(msk);
> > +		return msk;
> > +	}
> >
> > 	return NULL;
> > }
> > --
> > 2.43.0
> >
> >
> >
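For the scheduler-only alternative suggested above, a rough sketch could
keep msk_owned_by_me() everywhere and confine the set/clear to the
scheduler dispatch, where the caller already holds the msk lock.
run_bpf_scheduler() and mptcp_clear_bpf_iter_task() are assumed names for
illustration, not code from the series:

/* Sketch only: restrict bpf_iter_task handling to the BPF scheduler
 * dispatch path, which is always entered with the msk lock held.
 */
#include "protocol.h"	/* msk_owned_by_me(), struct mptcp_sock */

static int example_sched_dispatch(struct mptcp_sock *msk)
{
	int ret;

	msk_owned_by_me(msk);		/* lockdep: caller holds the msk lock */

	mptcp_set_bpf_iter_task(msk);	/* set only while the lock is held */
	ret = run_bpf_scheduler(msk);	/* assumed struct_ops invocation */
	mptcp_clear_bpf_iter_task(msk);	/* assumed clear helper */

	return ret;
}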
Hi Geliang,

On 07/03/2025 04:54, Geliang Tang wrote:
> On Mon, 2025-03-03 at 17:37 -0800, Mat Martineau wrote:

(...)

>> The code above will set msk->bpf_iter_task even when the lock is not
>> held,
>> and then it is never cleared.
>>
>> To set/clear as expected for get/setsockopt I think the task pointer
>> would
>> have to be set in places where the locks are acquired and released,
>> like
>> these:
>>
>> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1955
>> https://elixir.bootlin.com/linux/v6.14-rc5/source/kernel/bpf/cgroup.c#L1846
>>
>
> The get/setsockopt scenario is a bit complicated. I think we can use
> the bpf_iter_task check only for bpf schedulers, keep the
> get/setsockopt scenario unchanged, and continue to use the
> msk_owned_by_me(msk) check. What do you think?

I think in any case, it doesn't hurt to keep msk_owned_by_me(msk) for
lockdep.

If we think it is difficult to have the bpf_iter for both CG
[gs]etsockopt and struct_ops, then we can also drop the patches for the
[gs]etsockopt. Or, from a kfunc, is it easy to check if we are called
from a struct_ops?

> In addition, I think it's better to add bpf_iter_task to struct
> mptcp_sched_data instead in struct mptcp_sock, since it's more related
> to mptcp scheduler. For specific usage, please see [1]. What do you
> think of this?

Please see my reply to your comment on this thread. In short, we can
always add new parameters later, when we need them, no?

https://lore.kernel.org/706a3051-8328-4321-8eea-71d4120ea996@kernel.org

>
> Thanks,
> -Geliang
>
> [1]
> https://patchwork.kernel.org/project/mptcp/cover/cover.1739788598.git.tanggeliang@kylinos.cn/
>
>>
>> - Mat
>>
>>> +		return msk;
>>> +	}
>>>
>>> -	if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
>>> -		return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
>>> +	if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
>>> +		msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
>>> +		mptcp_set_bpf_iter_task(msk);
>>> +		return msk;
>>> +	}
>>>
>>> 	return NULL;
>>> }
>>> --
>>> 2.43.0
>>>
>>>
>>>
>

Cheers,
Matt
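As a rough illustration of keeping msk_owned_by_me() for lockdep while the
iterator additionally verifies it runs in the lock-holding context, the
guard inside the bpf_iter kfunc might look like the following; the
bpf_iter_task field placement and names are assumptions based on this
discussion, not the final series:

/* Sketch only: lockdep assertion plus a runtime comparison of the
 * recorded lock-holding task against current. bpf_iter_task is the
 * per-msk field discussed in this thread.
 */
#include <linux/sched.h>	/* current */
#include "protocol.h"		/* msk_owned_by_me(), struct mptcp_sock */

static bool example_subflow_iter_allowed(struct mptcp_sock *msk)
{
	msk_owned_by_me(msk);	/* no-op unless lockdep is enabled */

	/* only iterate when called by the task that holds the msk lock */
	return READ_ONCE(msk->bpf_iter_task) == current;
}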
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index be222fa5f308..aff92f458e7f 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -197,14 +197,22 @@ static struct bpf_struct_ops bpf_mptcp_sched_ops = {
 
 struct mptcp_sock *bpf_mptcp_sock_from_sock(struct sock *sk)
 {
+	struct mptcp_sock *msk;
+
 	if (unlikely(!sk || !sk_fullsock(sk)))
 		return NULL;
 
-	if (sk->sk_protocol == IPPROTO_MPTCP)
-		return mptcp_sk(sk);
+	if (sk->sk_protocol == IPPROTO_MPTCP) {
+		msk = mptcp_sk(sk);
+		mptcp_set_bpf_iter_task(msk);
+		return msk;
+	}
 
-	if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk))
-		return mptcp_sk(mptcp_subflow_ctx(sk)->conn);
+	if (sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) {
+		msk = mptcp_sk(mptcp_subflow_ctx(sk)->conn);
+		mptcp_set_bpf_iter_task(msk);
+		return msk;
+	}
 
 	return NULL;
 }