
[mptcp-next] Squash to "bpf: Add mptcp_subflow bpf_iter"

Message ID 20250226180727.2499531-2-matttbe@kernel.org (mailing list archive)
State Accepted, archived
Commit dd8797c750fe851f953242b2ddf0e538a761854a
Delegated to: Matthieu Baerts
Series [mptcp-next] Squash to "bpf: Add mptcp_subflow bpf_iter"

Checks

Context                                                Check    Description
matttbe/build                                          success  Build and static analysis OK
matttbe/checkpatch                                     success  total: 0 errors, 0 warnings, 0 checks, 9 lines checked
matttbe/shellcheck                                     success  MPTCP selftests files have not been modified
matttbe/KVM_Validation__normal                         success  Success! ✅
matttbe/KVM_Validation__debug                          success  Success! ✅
matttbe/KVM_Validation__btf-normal__only_bpftest_all_  success  Success! ✅
matttbe/KVM_Validation__btf-debug__only_bpftest_all_   success  Success! ✅

Commit Message

Matthieu Baerts Feb. 26, 2025, 6:07 p.m. UTC
From what we understood, the socket lock is held when the iterator is
used from a CG [gs]etsockopt hook: the BPF infra seems to take care of
that in this case. Mat will try to get confirmation.

The idea is then to add this msk_owned_by_me() check to make sure the
assumption is correct, and remains so: with a debug kconfig, the new
selftest should complain if the lock is not held. In other words, if the
tests continue to pass with this patch, the Squash-to series can
probably be applied.

Note that this is just a debug check, relying on the selftest to cover
all cases. It therefore cannot be used to return an error when the lock
is not held.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
Cc: Geliang Tang <geliang@kernel.org>
Cc: Mat Martineau <martineau@kernel.org>
---
 net/mptcp/bpf.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

MPTCP CI Feb. 26, 2025, 7:17 p.m. UTC | #1
Hi Matthieu,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/13550710113

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/c610b4dde837
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=938175


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Geliang Tang Feb. 27, 2025, 2:03 a.m. UTC | #2
Hi Matt,

On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
> From what we understood, when being used from a CG [gs]etsockopt
> hook,
> the socket lock will be held. It seems that the BPF infra will make
> sure
> in this case. Mat will try to get the confirmation.
> 
> The idea is then to add this msk_owned_by_me() check to make sure the
> assumption is correct, and will continue to be: the new selftest
> should
> complain if not when used with a debug kconfig. In other words, if
> the
> tests continue to pass with this patch, the Squash-to series can
> probably be applied.
> 
> Note that this is just a debug check, hoping that the selftest will
> cover all cases. We can then not use this check to return an error if
> it
> is not held.
> 
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

Thanks for updating this for me.

LGTM!

Reviewed-by: Geliang Tang <geliang@kernel.org>

> ---
> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>

I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
support" v4, together with this squash-to patch, as "Queued".

Thanks,
-Geliang

> Cc: Geliang Tang <geliang@kernel.org>
> Cc: Mat Martineau <martineau@kernel.org>
> ---
>  net/mptcp/bpf.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
> index ac7744c6006f..9e8c24c88022 100644
> --- a/net/mptcp/bpf.c
> +++ b/net/mptcp/bpf.c
> @@ -252,6 +252,9 @@ bpf_iter_mptcp_subflow_new(struct
> bpf_iter_mptcp_subflow *it,
>  		return -EINVAL;
>  
>  	msk = mptcp_sk(sk);
> +
> +	msk_owned_by_me(msk);
> +
>  	kit->msk = msk;
>  	kit->pos = &msk->conn_list;
>  	return 0;
Matthieu Baerts Feb. 27, 2025, 8:32 a.m. UTC | #3
Hi Geliang, Mat,

On 27/02/2025 03:03, Geliang Tang wrote:
> Hi Matt,
> 
> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>> From what we understood, when being used from a CG [gs]etsockopt
>> hook,
>> the socket lock will be held. It seems that the BPF infra will make
>> sure
>> in this case. Mat will try to get the confirmation.
>>
>> The idea is then to add this msk_owned_by_me() check to make sure the
>> assumption is correct, and will continue to be: the new selftest
>> should
>> complain if not when used with a debug kconfig. In other words, if
>> the
>> tests continue to pass with this patch, the Squash-to series can
>> probably be applied.
>>
>> Note that this is just a debug check, hoping that the selftest will
>> cover all cases. We can then not use this check to return an error if
>> it
>> is not held.
>>
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> 
> Thanks for updating this for me.
> 
> LGTM!
> 
> Reviewed-by: Geliang Tang <geliang@kernel.org>

Thank you for the review!

>> ---
>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
> 
> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
> support" v4, together with this squash-to patch, as "Queued".

Thank you. If that's OK, I will wait for Mat's green light before
applying the patches, because he wanted to look at the BPF code to make
sure our assumption was correct.

Cheers,
Matt
Mat Martineau March 1, 2025, 1:37 a.m. UTC | #4
On Thu, 27 Feb 2025, Matthieu Baerts wrote:

> Hi Geliang, Mat,
>
> On 27/02/2025 03:03, Geliang Tang wrote:
>> Hi Matt,
>>
>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>> From what we understood, when being used from a CG [gs]etsockopt
>>> hook,
>>> the socket lock will be held. It seems that the BPF infra will make
>>> sure
>>> in this case. Mat will try to get the confirmation.
>>>
>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>> assumption is correct, and will continue to be: the new selftest
>>> should
>>> complain if not when used with a debug kconfig. In other words, if
>>> the
>>> tests continue to pass with this patch, the Squash-to series can
>>> probably be applied.
>>>
>>> Note that this is just a debug check, hoping that the selftest will
>>> cover all cases. We can then not use this check to return an error if
>>> it
>>> is not held.
>>>
>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>
>> Thanks for updating this for me.
>>
>> LGTM!
>>
>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>
> Thank you for the review!
>
>>> ---
>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>
>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>> support" v4, together with this squash-to patch, as "Queued".
>
> Thank you. If that's OK, I will wait for Mat's green light before
> applying the patches, because he wanted to look at the BPF code to make
> sure our assumption was correct.
>

Ok, I did miss it before but in the cases of setsockopt/getsockopt, the 
socket lock is acquired by the bpf wrapper code:

https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955

That's why the test passes. There's still the general issue that the 
iterator could be used in any bpf code that can pass a valid msk pointer. 
The existing lock checks in the export branch check that the socket is 
locked, but it could be locked by some other owner.
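
As an illustration of the point above, here is a minimal userspace sketch of the pattern: the wrapper takes the socket lock, runs the hook, then releases the lock, so a check inside the hook always sees the lock held. All `model_*` names are hypothetical stand-ins and are not kernel APIs; the real wrapper is the code in kernel/bpf/cgroup.c linked above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: 'struct model_sock' stands in for struct sock. */
struct model_sock {
	bool owned;	/* models "the socket lock is currently held" */
};

static void model_lock_sock(struct model_sock *sk)
{
	sk->owned = true;
}

static void model_release_sock(struct model_sock *sk)
{
	sk->owned = false;
}

/* Stand-in for the attached BPF program. */
typedef int (*model_hook_t)(struct model_sock *sk);

/* Models the sockopt wrapper: the hook only runs with the lock held. */
static int model_run_sockopt_hook(struct model_sock *sk, model_hook_t hook)
{
	int ret;

	model_lock_sock(sk);
	ret = hook(sk);
	model_release_sock(sk);
	return ret;
}

/* Example hook: returns 0 if the lock is held when it runs, -1 otherwise. */
static int model_hook_sees_lock(struct model_sock *sk)
{
	return sk->owned ? 0 : -1;
}
```

Calling `model_hook_sees_lock()` through `model_run_sockopt_hook()` returns 0, while calling it directly on an unlocked socket returns -1. Note that the `owned` flag only says that somebody holds the lock, not who, which is the general issue raised above.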


As we've already discussed, msk_owned_by_me() doesn't work everywhere 
because we can't change behavior if the lock isn't owned. However, we 
don't need to handle every locking case like lockdep does - for the 
scheduler BPF case, we only care if the BPF code is running in the 
scheduler context. So, I think we can do a simplified version of tracking 
the lock context:

1. Add a 'struct task_struct *bpf_scheduler_task' field to struct 
mptcp_sock.

2. Do a WRITE_ONCE(msk->bpf_scheduler_task, current) before calling a 
packet scheduler hook, and WRITE_ONCE(msk->bpf_scheduler_task, NULL) after 
the hook returns.

3. In bpf_iter_mptcp_subflow_new(), check 
"READ_ONCE(msk->bpf_scheduler_task) == current" to confirm the correct 
task, return -EINVAL if it doesn't match.


(It would help to create helper functions for setting and checking that 
value)
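
A minimal userspace sketch of steps 1-3 with such helpers, for illustration only. Everything here is an assumption: `struct model_msk`, `struct model_task` and the `model_*` helpers are hypothetical stand-ins for struct mptcp_sock, struct task_struct and helpers that do not exist yet. In the kernel, the plain stores and loads would be WRITE_ONCE()/READ_ONCE(), and `cur` would be `current`.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for struct task_struct. */
struct model_task {
	int pid;
};

/* Stand-in for struct mptcp_sock, with the field proposed in step 1. */
struct model_msk {
	struct model_task *bpf_scheduler_task;	/* NULL when no hook runs */
};

/* Step 2, before the hook: remember which task is calling the hook. */
static void model_set_hook_task(struct model_msk *msk, struct model_task *cur)
{
	msk->bpf_scheduler_task = cur;
}

/* Step 2, after the hook returns: forget the task. */
static void model_clear_hook_task(struct model_msk *msk)
{
	msk->bpf_scheduler_task = NULL;
}

/* Step 3: only let the task that is running the hook create an iterator. */
static int model_iter_new(const struct model_msk *msk,
			  const struct model_task *cur)
{
	if (!cur || msk->bpf_scheduler_task != cur)
		return -EINVAL;
	return 0;
}
```

With this model, `model_iter_new()` succeeds only between the set and clear calls, and only for the task that was recorded; any other task, or a call outside the hook, gets -EINVAL.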

Do you think this iterator will be useful in other places, like bpf path 
manager? The same approach would work there, but a better name than 
"bpf_scheduler_task" would help.


- Mat
Matthieu Baerts March 1, 2025, 10:37 a.m. UTC | #5
Hi Mat,

On 01/03/2025 02:37, Mat Martineau wrote:
> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
> 
>> Hi Geliang, Mat,
>>
>> On 27/02/2025 03:03, Geliang Tang wrote:
>>> Hi Matt,
>>>
>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>> From what we understood, when being used from a CG [gs]etsockopt
>>>> hook,
>>>> the socket lock will be held. It seems that the BPF infra will make
>>>> sure
>>>> in this case. Mat will try to get the confirmation.
>>>>
>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>> assumption is correct, and will continue to be: the new selftest
>>>> should
>>>> complain if not when used with a debug kconfig. In other words, if
>>>> the
>>>> tests continue to pass with this patch, the Squash-to series can
>>>> probably be applied.
>>>>
>>>> Note that this is just a debug check, hoping that the selftest will
>>>> cover all cases. We can then not use this check to return an error if
>>>> it
>>>> is not held.
>>>>
>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>
>>> Thanks for updating this for me.
>>>
>>> LGTM!
>>>
>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>
>> Thank you for the review!
>>
>>>> ---
>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>
>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>> support" v4, together with this squash-to patch, as "Queued".
>>
>> Thank you. If that's OK, I will wait for Mat's green light before
>> applying the patches, because he wanted to look at the BPF code to make
>> sure our assumption was correct.
>>
> 
> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
> socket lock is acquired by the bpf wrapper code:
> 
> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955
> 
> That's why the test passes. There's still the general issue that the
> iterator could be used in any bpf code that can pass a valid msk
> pointer. The existing lock checks in the export branch check that the
> socket is locked, but it could be locked by some other owner.

Thank you for having looked! Because these new kfuncs are currently only
tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
patches and send this to BPF maintainers?


Once bpf_iter is used in the schedulers and the PM, more checks can be
added.

> As we've already discussed, msk_owned_by_me() doesn't work everywhere
> because we can't change behavior if the lock isn't owned. However, we
> don't need to handle every locking case like lockdep does - for the
> scheduler BPF case, we only care if the BPF code is running in the
> scheduler context. So, I think we can do a simplified version of
> tracking the lock context:
> 
> 1. Add a 'struct task_struct *bpf_scheduler_task' field to struct
> mptcp_sock.
> 
> 2. Do a WRITE_ONCE(msk->bpf_scheduler_task, current) before calling a
> packet scheduler hook, and WRITE_ONCE(msk->bpf_scheduler_task, NULL)
> after the hook returns.
> 
> 3. In bpf_iter_mptcp_subflow_new(), check "READ_ONCE(msk->
> bpf_scheduler_task) == current" to confirm the correct task, return
> -EINVAL if it doesn't match.

That's a very good idea! Yes, we will need to add such checks before
msk_owned_by_me() when the bpf_iter will be used elsewhere.

> (It would help to create helper functions for setting and checking that
> value)

+1.

> Do you think this iterator will be useful in other places, like bpf path
> manager? The same approach would work there, but a better name than
> "bpf_scheduler_task" would help.
Yes, good point! "msk->bpf_struct_ops_task"?

(that's something that could also be added after having applied the "use
bpf_iter in bpf schedulers" series)

Cheers,
Matt
Mat Martineau March 4, 2025, 1:42 a.m. UTC | #6
On Sat, 1 Mar 2025, Matthieu Baerts wrote:

> Hi Mat,
>
> On 01/03/2025 02:37, Mat Martineau wrote:
>> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>>
>>> Hi Geliang, Mat,
>>>
>>> On 27/02/2025 03:03, Geliang Tang wrote:
>>>> Hi Matt,
>>>>
>>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>>> From what we understood, when being used from a CG [gs]etsockopt
>>>>> hook,
>>>>> the socket lock will be held. It seems that the BPF infra will make
>>>>> sure
>>>>> in this case. Mat will try to get the confirmation.
>>>>>
>>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>>> assumption is correct, and will continue to be: the new selftest
>>>>> should
>>>>> complain if not when used with a debug kconfig. In other words, if
>>>>> the
>>>>> tests continue to pass with this patch, the Squash-to series can
>>>>> probably be applied.
>>>>>
>>>>> Note that this is just a debug check, hoping that the selftest will
>>>>> cover all cases. We can then not use this check to return an error if
>>>>> it
>>>>> is not held.
>>>>>
>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>
>>>> Thanks for updating this for me.
>>>>
>>>> LGTM!
>>>>
>>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>>
>>> Thank you for the review!
>>>
>>>>> ---
>>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>>
>>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>>> support" v4, together with this squash-to patch, as "Queued".
>>>
>>> Thank you. If that's OK, I will wait for Mat's green light before
>>> applying the patches, because he wanted to look at the BPF code to make
>>> sure our assumption was correct.
>>>
>>
>> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
>> socket lock is acquired by the bpf wrapper code:
>>
>> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/cgroup.c#L1955
>>
>> That's why the test passes. There's still the general issue that the
>> iterator could be used in any bpf code that can pass a valid msk
>> pointer. The existing lock checks in the export branch check that the
>> socket is locked, but it could be locked by some other owner.
>
> Thank you for having looked! Because these new kfuncs are currently only
> tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
> patches and send this to BPF maintainers?

Hi Matthieu -

I think once the squash-to patches are accepted into our repo (a few more 
changes are needed), it would be ok to upstream them to the BPF 
maintainers.

- Mat


>
>
> When bpf_iter will be used in the schedulers and the PM, more checks can
> be added later.
>
>> As we've already discussed, msk_owned_by_me() doesn't work everywhere
>> because we can't change behavior if the lock isn't owned. However, we
>> don't need to handle every locking case like lockdep does - for the
>> scheduler BPF case, we only care if the BPF code is running in the
>> scheduler context. So, I think we can do a simplified version of
>> tracking the lock context:
>>
>> 1. Add a 'struct task_struct *bpf_scheduler_task' field to struct
>> mptcp_sock.
>>
>> 2. Do a WRITE_ONCE(msk->bpf_scheduler_task, current) before calling a
>> packet scheduler hook, and WRITE_ONCE(msk->bpf_scheduler_task, NULL)
>> after the hook returns.
>>
>> 3. In bpf_iter_mptcp_subflow_new(), check "READ_ONCE(msk->
>> bpf_scheduler_task) == current" to confirm the correct task, return
>> -EINVAL if it doesn't match.
>
> That's a very good idea! Yes, we will need to add such checks before
> msk_owned_by_me() when the bpf_iter will be used elsewhere.
>
>> (It would help to create helper functions for setting and checking that
>> value)
>
> +1.
>
>> Do you think this iterator will be useful in other places, like bpf path
>> manager? The same approach would work there, but a better name than
>> "bpf_scheduler_task" would help.
> Yes, good point! "msk->bpf_struct_ops_task"?
>
> (that's something that could also be added after having applied the "use
> bpf_iter in bpf schedulers" series)
>
> Cheers,
> Matt
> -- 
> Sponsored by the NGI0 Core fund.
>
>
Matthieu Baerts March 4, 2025, 9:22 a.m. UTC | #7
Hi Mat,

On 04/03/2025 02:42, Mat Martineau wrote:
> On Sat, 1 Mar 2025, Matthieu Baerts wrote:
> 
>> Hi Mat,
>>
>> On 01/03/2025 02:37, Mat Martineau wrote:
>>> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>>>
>>>> Hi Geliang, Mat,
>>>>
>>>> On 27/02/2025 03:03, Geliang Tang wrote:
>>>>> Hi Matt,
>>>>>
>>>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>>>> From what we understood, when being used from a CG [gs]etsockopt
>>>>>> hook,
>>>>>> the socket lock will be held. It seems that the BPF infra will make
>>>>>> sure
>>>>>> in this case. Mat will try to get the confirmation.
>>>>>>
>>>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>>>> assumption is correct, and will continue to be: the new selftest
>>>>>> should
>>>>>> complain if not when used with a debug kconfig. In other words, if
>>>>>> the
>>>>>> tests continue to pass with this patch, the Squash-to series can
>>>>>> probably be applied.
>>>>>>
>>>>>> Note that this is just a debug check, hoping that the selftest will
>>>>>> cover all cases. We can then not use this check to return an error if
>>>>>> it
>>>>>> is not held.
>>>>>>
>>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>>
>>>>> Thanks for updating this for me.
>>>>>
>>>>> LGTM!
>>>>>
>>>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>>>
>>>> Thank you for the review!
>>>>
>>>>>> ---
>>>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>>>
>>>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>>>> support" v4, together with this squash-to patch, as "Queued".
>>>>
>>>> Thank you. If that's OK, I will wait for Mat's green light before
>>>> applying the patches, because he wanted to look at the BPF code to make
>>>> sure our assumption was correct.
>>>>
>>>
>>> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
>>> socket lock is acquired by the bpf wrapper code:
>>>
>>> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/
>>> cgroup.c#L1955
>>>
>>> That's why the test passes. There's still the general issue that the
>>> iterator could be used in any bpf code that can pass a valid msk
>>> pointer. The existing lock checks in the export branch check that the
>>> socket is locked, but it could be locked by some other owner.
>>
>> Thank you for having looked! Because these new kfuncs are currently only
>> tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
>> patches and send this to BPF maintainers?
> 
> Hi Matthieu -
> 
> I think once the squash-to patches are accepted into our repo (a few
> more changes are needed), it would be ok to upstream them to the BPF
> maintainers.

Thank you for your reply! But I admit I'm a bit lost: which changes are
you talking about?

In the "Add mptcp_subflow bpf_iter support" series we sent to the BPF
maintainers, this support is only added to CG [gs]etsockopt where the sk
lock is owned, so no need to have extra checks for the moment, no? Then
the "Squash to "Add mptcp_subflow bpf_iter support"" v4 (not the last
one you looked at, the v5) should be enough, no?

My understanding is that setting "msk->bpf_XXX" to "current" will only
be needed when mptcp_subflow bpf_iter will be allowed to be used with
struct_ops, no? If yes, better to keep this complexity for later, no?

Cheers,
Matt
Mat Martineau March 5, 2025, 1:35 a.m. UTC | #8
On Tue, 4 Mar 2025, Matthieu Baerts wrote:

> Hi Mat,
>
> On 04/03/2025 02:42, Mat Martineau wrote:
>> On Sat, 1 Mar 2025, Matthieu Baerts wrote:
>>
>>> Hi Mat,
>>>
>>> On 01/03/2025 02:37, Mat Martineau wrote:
>>>> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>>>>
>>>>> Hi Geliang, Mat,
>>>>>
>>>>> On 27/02/2025 03:03, Geliang Tang wrote:
>>>>>> Hi Matt,
>>>>>>
>>>>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>>>>> From what we understood, when being used from a CG [gs]etsockopt
>>>>>>> hook,
>>>>>>> the socket lock will be held. It seems that the BPF infra will make
>>>>>>> sure
>>>>>>> in this case. Mat will try to get the confirmation.
>>>>>>>
>>>>>>> The idea is then to add this msk_owned_by_me() check to make sure the
>>>>>>> assumption is correct, and will continue to be: the new selftest
>>>>>>> should
>>>>>>> complain if not when used with a debug kconfig. In other words, if
>>>>>>> the
>>>>>>> tests continue to pass with this patch, the Squash-to series can
>>>>>>> probably be applied.
>>>>>>>
>>>>>>> Note that this is just a debug check, hoping that the selftest will
>>>>>>> cover all cases. We can then not use this check to return an error if
>>>>>>> it
>>>>>>> is not held.
>>>>>>>
>>>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>>>
>>>>>> Thanks for updating this for me.
>>>>>>
>>>>>> LGTM!
>>>>>>
>>>>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>>>>
>>>>> Thank you for the review!
>>>>>
>>>>>>> ---
>>>>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>>>>
>>>>>> I changed the state of this set, Squash to "Add mptcp_subflow bpf_iter
>>>>>> support" v4, together with this squash-to patch, as "Queued".
>>>>>
>>>>> Thank you. If that's OK, I will wait for Mat's green light before
>>>>> applying the patches, because he wanted to look at the BPF code to make
>>>>> sure our assumption was correct.
>>>>>
>>>>
>>>> Ok, I did miss it before but in the cases of setsockopt/getsockopt, the
>>>> socket lock is acquired by the bpf wrapper code:
>>>>
>>>> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/
>>>> cgroup.c#L1955
>>>>
>>>> That's why the test passes. There's still the general issue that the
>>>> iterator could be used in any bpf code that can pass a valid msk
>>>> pointer. The existing lock checks in the export branch check that the
>>>> socket is locked, but it could be locked by some other owner.
>>>
>>> Thank you for having looked! Because these new kfuncs are currently only
>>> tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
>>> patches and send this to BPF maintainers?
>>
>> Hi Matthieu -
>>
>> I think once the squash-to patches are accepted into our repo (a few
>> more changes are needed), it would be ok to upstream them to the BPF
>> maintainers.
>
> Thank you for your reply! But I admit I'm a bit lost: which changes are
> you talking about?
>
> In the "Add mptcp_subflow bpf_iter support" series we sent to the BPF
> maintainers, this support is only added to CG [gs]etsockopt where the sk
> lock is owned, so no need to have extra checks for the moment, no? Then
> the "Squash to "Add mptcp_subflow bpf_iter support"" v4 (not the last
> one you looked at, the v5) should be enough, no?
>

Hi Matthieu -

Thanks for clarifying, I misunderstood the scope of "all these squash-to 
patches" to include v5 :)

I see what you mean, and I do agree that squashing the v4 patches 
(https://patchwork.kernel.org/project/mptcp/cover/cover.1740368110.git.tanggeliang@kylinos.cn/) 
for another round of BPF maintainer review is ok with me since the sockopt 
calls have the necessary locking, so we can separately upstream the 
iterator series (with sockopt limitations) now.

> My understanding is that setting "msk->bpf_XXX" to "current" will only
> be needed when mptcp_subflow bpf_iter will be allowed to be used with
> struct_ops, no? If yes, better to keep this complexity for later, no?

Agree here as well, the task_struct checking is only needed once there are 
other locking situations to verify.


- Mat
Matthieu Baerts March 5, 2025, 3:14 p.m. UTC | #9
Hi Mat,

On 05/03/2025 02:35, Mat Martineau wrote:
> On Tue, 4 Mar 2025, Matthieu Baerts wrote:
> 
>> Hi Mat,
>>
>> On 04/03/2025 02:42, Mat Martineau wrote:
>>> On Sat, 1 Mar 2025, Matthieu Baerts wrote:
>>>
>>>> Hi Mat,
>>>>
>>>> On 01/03/2025 02:37, Mat Martineau wrote:
>>>>> On Thu, 27 Feb 2025, Matthieu Baerts wrote:
>>>>>
>>>>>> Hi Geliang, Mat,
>>>>>>
>>>>>> On 27/02/2025 03:03, Geliang Tang wrote:
>>>>>>> Hi Matt,
>>>>>>>
>>>>>>> On Wed, 2025-02-26 at 19:07 +0100, Matthieu Baerts (NGI0) wrote:
>>>>>>>> From what we understood, when being used from a CG [gs]etsockopt
>>>>>>>> hook,
>>>>>>>> the socket lock will be held. It seems that the BPF infra will make
>>>>>>>> sure
>>>>>>>> in this case. Mat will try to get the confirmation.
>>>>>>>>
>>>>>>>> The idea is then to add this msk_owned_by_me() check to make
>>>>>>>> sure the
>>>>>>>> assumption is correct, and will continue to be: the new selftest
>>>>>>>> should
>>>>>>>> complain if not when used with a debug kconfig. In other words, if
>>>>>>>> the
>>>>>>>> tests continue to pass with this patch, the Squash-to series can
>>>>>>>> probably be applied.
>>>>>>>>
>>>>>>>> Note that this is just a debug check, hoping that the selftest will
>>>>>>>> cover all cases. We can then not use this check to return an
>>>>>>>> error if
>>>>>>>> it
>>>>>>>> is not held.
>>>>>>>>
>>>>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>>>>
>>>>>>> Thanks for updating this for me.
>>>>>>>
>>>>>>> LGTM!
>>>>>>>
>>>>>>> Reviewed-by: Geliang Tang <geliang@kernel.org>
>>>>>>
>>>>>> Thank you for the review!
>>>>>>
>>>>>>>> ---
>>>>>>>> Based-on: <cover.1740368110.git.tanggeliang@kylinos.cn>
>>>>>>>
>>>>>>> I changed the state of this set, Squash to "Add mptcp_subflow
>>>>>>> bpf_iter
>>>>>>> support" v4, together with this squash-to patch, as "Queued".
>>>>>>
>>>>>> Thank you. If that's OK, I will wait for Mat's green light before
>>>>>> applying the patches, because he wanted to look at the BPF code to
>>>>>> make
>>>>>> sure our assumption was correct.
>>>>>>
>>>>>
>>>>> Ok, I did miss it before but in the cases of setsockopt/getsockopt,
>>>>> the
>>>>> socket lock is acquired by the bpf wrapper code:
>>>>>
>>>>> https://elixir.bootlin.com/linux/v6.14-rc4/source/kernel/bpf/
>>>>> cgroup.c#L1955
>>>>>
>>>>> That's why the test passes. There's still the general issue that the
>>>>> iterator could be used in any bpf code that can pass a valid msk
>>>>> pointer. The existing lock checks in the export branch check that the
>>>>> socket is locked, but it could be locked by some other owner.
>>>>
>>>> Thank you for having looked! Because these new kfuncs are currently
>>>> only
>>>> tied to BPF_PROG_TYPE_CGROUP_SOCKOPT, can we apply all these squash-to
>>>> patches and send this to BPF maintainers?
>>>
>>> Hi Matthieu -
>>>
>>> I think once the squash-to patches are accepted into our repo (a few
>>> more changes are needed), it would be ok to upstream them to the BPF
>>> maintainers.
>>
>> Thank you for your reply! But I admit I'm a bit lost: which changes are
>> you talking about?
>>
>> In the "Add mptcp_subflow bpf_iter support" series we sent to the BPF
>> maintainers, this support is only added to CG [gs]etsockopt where the sk
>> lock is owned, so no need to have extra checks for the moment, no? Then
>> the "Squash to "Add mptcp_subflow bpf_iter support"" v4 (not the last
>> one you looked at, the v5) should be enough, no?
>>
> 
> Hi Matthieu -
> 
> Thanks for clarifying, I misunderstood the scope of "all these squash-to
> patches" to include v5 :)
> 
> I see what you mean, and I do agree that squashing the v4 patches
> (https://patchwork.kernel.org/project/mptcp/cover/
> cover.1740368110.git.tanggeliang@kylinos.cn/) for another round of BPF
> maintainer review is ok with me since the sockopt calls have the
> necessary locking, so we can separately upstream the iterator series
> (with sockopt limitations) now.

OK, great! I will then apply the v4, and archive the v5 on Patchwork to
avoid confusion.

I will then also apply "use bpf_iter in bpf schedulers" v15, even if the
"task_struct" checking will still need to be added, but at least we have
some material to show to Martin.

>> My understanding is that setting "msk->bpf_XXX" to "current" will only
>> be needed when mptcp_subflow bpf_iter will be allowed to be used with
>> struct_ops, no? If yes, better to keep this complexity for later, no?
> 
> Agree here as well, the task_struct checking is only needed once there
> are other locking situations to verify.
Good! I guess Geliang will look at that later.

Cheers,
Matt

Patch

diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index ac7744c6006f..9e8c24c88022 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -252,6 +252,9 @@  bpf_iter_mptcp_subflow_new(struct bpf_iter_mptcp_subflow *it,
 		return -EINVAL;
 
 	msk = mptcp_sk(sk);
+
+	msk_owned_by_me(msk);
+
 	kit->msk = msk;
 	kit->pos = &msk->conn_list;
 	return 0;
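
To illustrate why the check added by this patch can warn but cannot be used to return an error, here is a minimal userspace sketch of a debug-only ownership assertion. It is a model under stated assumptions, not the kernel implementation: `MODEL_DEBUG` stands in for a debug kconfig, and `struct model_msk`, `model_owned_by_me()` and `model_iter_new()` are hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for struct mptcp_sock. */
struct model_msk {
	bool lock_held;
};

/* Number of warnings emitted so far (only grows in debug builds). */
static int model_warnings;

/*
 * Debug-only assertion: it warns when the lock is not held, and is a
 * no-op when MODEL_DEBUG is not defined. It returns nothing, so callers
 * cannot branch on its result.
 */
static void model_owned_by_me(const struct model_msk *msk)
{
#ifdef MODEL_DEBUG
	if (!msk->lock_held) {
		model_warnings++;
		fprintf(stderr, "WARNING: msk lock not held\n");
	}
#else
	(void)msk;
#endif
}

/* Models the iterator constructor: the assertion never changes the result. */
static int model_iter_new(const struct model_msk *msk)
{
	if (!msk)
		return -EINVAL;

	model_owned_by_me(msk);
	return 0;
}
```

Built without `-DMODEL_DEBUG`, `model_iter_new()` returns 0 silently even when the lock is not held; built with it, the return value is the same but a warning is printed, which is what a selftest run on a debug build would notice.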