diff mbox series

[mptcp-next] mptcp: pm: userspace: avoid scheduling in-kernel PM worker

Message ID 20250224-mptcp-userspace-avoid-worker-v1-1-127325d3e9a4@kernel.org (mailing list archive)
State Superseded, archived
Headers show
Series [mptcp-next] mptcp: pm: userspace: avoid scheduling in-kernel PM worker | expand

Checks

Context Check Description
matttbe/build success Build and static analysis OK
matttbe/checkpatch success total: 0 errors, 0 warnings, 0 checks, 34 lines checked
matttbe/shellcheck success MPTCP selftests files have not been modified
matttbe/KVM_Validation__normal success Success! ✅
matttbe/KVM_Validation__debug success Success! ✅
matttbe/KVM_Validation__btf-normal__only_bpftest_all_ success Success! ✅
matttbe/KVM_Validation__btf-debug__only_bpftest_all_ success Success! ✅

Commit Message

Matthieu Baerts Feb. 24, 2025, 7:01 p.m. UTC
When the userspace PM is used, there is no need to schedule the PM
worker for in-kernel specific tasks, e.g. creating new subflows, or
sending more ADD_ADDR.

Now, these tasks will be done only if the in-kernel PM is being used.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Notes:
 - I don't know if this should be seen as a fix. Maybe yes because it
   was not supposed to do that from the beginning, and it feels less
   risky not to schedule the worker when it is not needed. I'm open to
   suggestions here.
 - I'm mainly sharing this patch now because Geliang is going to modify
   these helpers to separate PM specific code.
   @Geliang: please rebase your series on top of this one (or include
   this patch if it is easier).
Cc: Geliang Tang <tanggeliang@kylinos.cn>
---
 net/mptcp/pm.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)


---
base-commit: 3716d837622ee1baf6fbe7e6a6a7a3df116e75fe
change-id: 20250224-mptcp-userspace-avoid-worker-93f367e39ca6

Best regards,

Comments

MPTCP CI Feb. 24, 2025, 8:05 p.m. UTC | #1
Hi Matthieu,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/13505941194

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/c3c9a58d9080
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=937225


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like this one, it is possible
that some reported issues are not due to your modifications. Still, do
not hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Geliang Tang Feb. 25, 2025, 4:21 a.m. UTC | #2
Hi Matt,

On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote:
> When the userspace PM is used, there is no need to schedule the PM
> worker for in-kernel specific tasks, e.g. creating new subflows, or
> sending more ADD_ADDR.
> 
> Now, these tasks will be done only if the in-kernel PM is being used.
> 
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
> Notes:
>  - I don't know if this should be seen as a fix. Maybe yes because it
>    was not supposed to do that from the beginning, and it feels less
>    risky not to schedule the worker when it is not needed. I'm open
> to
>    suggestions here.
>  - I'm mainly sharing this patch now because Geliang is going to
> modify
>    these helpers to separate PM specific code.
>    @Geliang: please rebase your series on top of this one (or include
>    this patch if it is easier).
> Cc: Geliang Tang <tanggeliang@kylinos.cn>
> ---
>  net/mptcp/pm.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c859f5682721e3 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk)
>  	 * racing paths - accept() and check_fully_established()
>  	 * be sure to serve this event only once.
>  	 */
> -	if (READ_ONCE(pm->work_pending) &&
> +	if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) &&

I think there's no need to modify this. work_pending is set to 0 by the
userspace PM, so in all cases where work_pending is 1, the path manager
must be the in-kernel PM.

So there is no need to add the mptcp_pm_is_kernel() check here. Just
checking work_pending is sufficient.

>  	    !(msk->pm.status & BIT(MPTCP_PM_ALREADY_ESTABLISHED)))
>  		mptcp_pm_schedule_work(msk, MPTCP_PM_ESTABLISHED);
>  
> @@ -166,7 +166,7 @@ void mptcp_pm_subflow_established(struct mptcp_sock *msk)
>  
>  	pr_debug("msk=%p\n", msk);
>  
> -	if (!READ_ONCE(pm->work_pending))
> +	if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm->work_pending))
>  		return;

No need to modify this either; work_pending is already checked later in
mptcp_pm_subflow_established():

    if (READ_ONCE(pm->work_pending))
        mptcp_pm_schedule_work(msk, MPTCP_PM_SUBFLOW_ESTABLISHED);

>  
>  	spin_lock_bh(&pm->lock);
> @@ -251,6 +251,9 @@ void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk,
>  
>  	pr_debug("msk=%p\n", msk);
>  
> +	if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm->work_pending))
> +		return;
> +

Same here; work_pending is already checked later in
mptcp_pm_add_addr_echoed():

if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm->work_pending))
    mptcp_pm_schedule_work(msk, MPTCP_PM_SUBFLOW_ESTABLISHED);

>  	spin_lock_bh(&pm->lock);
>  
>  	if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm->work_pending))
> @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock *msk,
>  	for (i = 0; i < rm_list->nr; i++)
>  		mptcp_event_addr_removed(msk, rm_list->ids[i]);
>  
> +	if (!mptcp_pm_is_kernel(msk))
> +		return;

This mptcp_pm_is_kernel() is indeed needed.

WDYT?

Thanks,
-Geliang

> +
>  	spin_lock_bh(&pm->lock);
>  	if (mptcp_pm_schedule_work(msk, MPTCP_PM_RM_ADDR_RECEIVED))
>  		pm->rm_list_rx = *rm_list;
> 
> ---
> base-commit: 3716d837622ee1baf6fbe7e6a6a7a3df116e75fe
> change-id: 20250224-mptcp-userspace-avoid-worker-93f367e39ca6
> 
> Best regards,
Matthieu Baerts Feb. 25, 2025, 9:25 a.m. UTC | #3
Hi Geliang,

Thank you for your review!

On 25/02/2025 05:21, Geliang Tang wrote:
> Hi Matt,
> 
> On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote:
>> When the userspace PM is used, there is no need to schedule the PM
>> worker for in-kernel specific tasks, e.g. creating new subflows, or
>> sending more ADD_ADDR.
>>
>> Now, these tasks will be done only if the in-kernel PM is being used.
>>
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>> ---
>> Notes:
>>  - I don't know if this should be seen as a fix. Maybe yes because it
>>    was not supposed to do that from the beginning, and it feels less
>>    risky not to schedule the worker when it is not needed. I'm open
>> to
>>    suggestions here.
>>  - I'm mainly sharing this patch now because Geliang is going to
>> modify
>>    these helpers to separate PM specific code.
>>    @Geliang: please rebase your series on top of this one (or include
>>    this patch if it is easier).
>> Cc: Geliang Tang <tanggeliang@kylinos.cn>
>> ---
>>  net/mptcp/pm.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>> index 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c859f5682721e3 100644
>> --- a/net/mptcp/pm.c
>> +++ b/net/mptcp/pm.c
>> @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk)
>>  	 * racing paths - accept() and check_fully_established()
>>  	 * be sure to serve this event only once.
>>  	 */
>> -	if (READ_ONCE(pm->work_pending) &&
>> +	if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) &&
> 
> I think there's no need to modify this. work_pending is set to 0 by 
> userspace pm, for all cases where work_pending is 1, the path manager
> must be an in-kernel pm.

Good point, I missed that!

'pm->work_pending' doesn't look like a good name for what it does.
Hopefully, this will become clearer once pm->ops is used, which will
clearly separate what is done by the different PMs.

Also, is it possible that we never set 'pm->work_pending' back to 'true'
once it has been set to 'false'? Does that mean that when we reach the
limits and something happens later (e.g. a subflow is removed, the
limits are changed, etc.), the worker is no longer used to establish new
subflows or to react when an ADD_ADDR is received?

> So there is no need to add mptcp_pm_is_kernel() check here. Just
> checking work_pending is sufficient.

Indeed.

>>  	    !(msk->pm.status & BIT(MPTCP_PM_ALREADY_ESTABLISHED)))
>>  		mptcp_pm_schedule_work(msk, MPTCP_PM_ESTABLISHED);
>>  
>> @@ -166,7 +166,7 @@ void mptcp_pm_subflow_established(struct mptcp_sock *msk)
>>  
>>  	pr_debug("msk=%p\n", msk);
>>  
>> -	if (!READ_ONCE(pm->work_pending))
>> +	if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm->work_pending))
>>  		return;
> 
> No need to modify this too, work_pending has been checked later in
> mptcp_pm_subflow_established:
> 
>     if (READ_ONCE(pm->work_pending))
>         mptcp_pm_schedule_work(msk, MPTCP_PM_SUBFLOW_ESTABLISHED);

Indeed.

>>  
>>  	spin_lock_bh(&pm->lock);
>> @@ -251,6 +251,9 @@ void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk,
>>  
>>  	pr_debug("msk=%p\n", msk);
>>  
>> +	if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm->work_pending))
>> +		return;
>> +
> 
> Same here, work_pending has been checked later in
> mptcp_pm_add_addr_echoed:
> 
> if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm->work_pending))
>     mptcp_pm_schedule_work(msk, MPTCP_PM_SUBFLOW_ESTABLISHED);

Indeed.

>>  	spin_lock_bh(&pm->lock);
>>  
>>  	if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm->work_pending))
>> @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock *msk,
>>  	for (i = 0; i < rm_list->nr; i++)
>>  		mptcp_event_addr_removed(msk, rm_list->ids[i]);
>>  
>> +	if (!mptcp_pm_is_kernel(msk))
>> +		return;
> 
> This mptcp_pm_is_kernel() is indeed needed.

Yes it is. And 'pm->work_pending' cannot be used, because it will be set
to false when the limits are reached, but a RM_ADDR will potentially
decrement counters.

I will send a v2 with only this change.

Cheers,
Matt
Matthieu Baerts Feb. 25, 2025, 6:15 p.m. UTC | #4
Hi Geliang,

On 25/02/2025 10:25, Matthieu Baerts wrote:
> Hi Geliang,
> 
> Thank you for your review!
> 
> On 25/02/2025 05:21, Geliang Tang wrote:
>> Hi Matt,
>>
>> On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote:
>>> When the userspace PM is used, there is no need to schedule the PM
>>> worker for in-kernel specific tasks, e.g. creating new subflows, or
>>> sending more ADD_ADDR.
>>>
>>> Now, these tasks will be done only if the in-kernel PM is being used.
>>>
>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>> ---
>>> Notes:
>>>  - I don't know if this should be seen as a fix. Maybe yes because it
>>>    was not supposed to do that from the beginning, and it feels less
>>>    risky not to schedule the worker when it is not needed. I'm open
>>> to
>>>    suggestions here.
>>>  - I'm mainly sharing this patch now because Geliang is going to
>>> modify
>>>    these helpers to separate PM specific code.
>>>    @Geliang: please rebase your series on top of this one (or include
>>>    this patch if it is easier).
>>> Cc: Geliang Tang <tanggeliang@kylinos.cn>
>>> ---
>>>  net/mptcp/pm.c | 10 ++++++++--
>>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>>> index 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c859f5682721e3 100644
>>> --- a/net/mptcp/pm.c
>>> +++ b/net/mptcp/pm.c
>>> @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk)
>>>  	 * racing paths - accept() and check_fully_established()
>>>  	 * be sure to serve this event only once.
>>>  	 */
>>> -	if (READ_ONCE(pm->work_pending) &&
>>> +	if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) &&
>>
>> I think there's no need to modify this. work_pending is set to 0 by 
>> userspace pm, for all cases where work_pending is 1, the path manager
>> must be an in-kernel pm.
> 
> Good point, I missed that!
> 
> 'pm->work_pending' doesn't look like a good name for what it does.
> Hopefully, this will be clearer when pm->ops will be used, which will
> clearly separate what is done by the different PMs.
> 
> Also, is it possible we never set 'pm->work_pending' back to 'true' once
> it has been set to 'false'? Does it mean when we reach the limits, and
> something happens (e.g. a removed subflow, limits are changed, etc.),
> the worker is no longer used to establish new subflows or react when an
> ADD_ADDR is received?

I didn't check, but it feels like we should add something like that:

> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> index d4328443d844..dee3e3c14ec3 100644
> --- a/net/mptcp/pm_netlink.c
> +++ b/net/mptcp/pm_netlink.c
> @@ -248,6 +248,8 @@ bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk)
>                 WRITE_ONCE(msk->pm.work_pending, false);
>                 return false;
>         }
> +
> +       WRITE_ONCE(msk->pm.work_pending, true);
>         return true;
>  }
>  
> @@ -911,6 +913,9 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
>                                 WRITE_ONCE(msk->pm.accept_addr, true);
>                 }
>         }
> +
> +       if (mptcp_pm_is_kernel(msk))
> +               mptcp_pm_nl_check_work_pending(msk);
>  }
>  
>  static void mptcp_pm_nl_rm_addr_received(struct mptcp_sock *msk)


This would re-enable msk->pm.work_pending when the limits had been
reached but are no longer. WDYT?

We would need a new subtest to confirm this patch is needed, and to
avoid regressions later.

(...)

>>> @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock *msk,
>>>  	for (i = 0; i < rm_list->nr; i++)
>>>  		mptcp_event_addr_removed(msk, rm_list->ids[i]);
>>>  
>>> +	if (!mptcp_pm_is_kernel(msk))
>>> +		return;
>>
>> This mptcp_pm_is_kernel() is indeed needed.
> 
> Yes it is. And 'pm->work_pending' cannot be used, because it will be set
> to false when the limits are reached, but a RM_ADDR will potentially
> decrement counters.

In fact, it is not needed: from what I see, the corresponding subflows
are removed on the kernel side, the userspace daemon doesn't need to
send a request to remove them. See the modification from 4d25247d3ae4
("mptcp: bypass in-kernel PM restrictions for non-kernel PMs") in
mptcp_pm_nl_rm_addr_or_subflow().

> I will send a v2 with only this change.
I will send a v2 with only an early exit in mptcp_pm_add_addr_echoed()
if there is no need to schedule the worker.

Cheers,
Matt
Geliang Tang Feb. 26, 2025, 1:05 a.m. UTC | #5
Hi Matt,

On Tue, 2025-02-25 at 19:15 +0100, Matthieu Baerts wrote:
> Hi Geliang,
> 
> On 25/02/2025 10:25, Matthieu Baerts wrote:
> > Hi Geliang,
> > 
> > Thank you for your review!
> > 
> > On 25/02/2025 05:21, Geliang Tang wrote:
> > > Hi Matt,
> > > 
> > > On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote:
> > > > When the userspace PM is used, there is no need to schedule the
> > > > PM
> > > > worker for in-kernel specific tasks, e.g. creating new
> > > > subflows, or
> > > > sending more ADD_ADDR.
> > > > 
> > > > Now, these tasks will be done only if the in-kernel PM is being
> > > > used.
> > > > 
> > > > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> > > > ---
> > > > Notes:
> > > >  - I don't know if this should be seen as a fix. Maybe yes
> > > > because it
> > > >    was not supposed to do that from the beginning, and it feels
> > > > less
> > > >    risky not to schedule the worker when it is not needed. I'm
> > > > open
> > > > to
> > > >    suggestions here.
> > > >  - I'm mainly sharing this patch now because Geliang is going
> > > > to
> > > > modify
> > > >    these helpers to separate PM specific code.
> > > >    @Geliang: please rebase your series on top of this one (or
> > > > include
> > > >    this patch if it is easier).
> > > > Cc: Geliang Tang <tanggeliang@kylinos.cn>
> > > > ---
> > > >  net/mptcp/pm.c | 10 ++++++++--
> > > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> > > > index 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c859f5682721e3 100644
> > > > --- a/net/mptcp/pm.c
> > > > +++ b/net/mptcp/pm.c
> > > > @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk)
> > > >  	 * racing paths - accept() and check_fully_established()
> > > >  	 * be sure to serve this event only once.
> > > >  	 */
> > > > -	if (READ_ONCE(pm->work_pending) &&
> > > > +	if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) &&
> > > 
> > > I think there's no need to modify this. work_pending is set to 0
> > > by 
> > > userspace pm, for all cases where work_pending is 1, the path
> > > manager
> > > must be an in-kernel pm.
> > 
> > Good point, I missed that!
> > 
> > 'pm->work_pending' doesn't look like a good name for what it does.
> > Hopefully, this will be clearer when pm->ops will be used, which
> > will
> > clearly separate what is done by the different PMs.
> > 
> > Also, is it possible we never set 'pm->work_pending' back to 'true'
> > once
> > it has been set to 'false'? Does it mean when we reach the limits,
> > and
> > something happens (e.g. a removed subflow, limits are changed,
> > etc.),
> > the worker is no longer used to establish new subflows or react
> > when an
> > ADD_ADDR is received?
> 
> I didn't check, but it feels like we should add something like that:

Yes, it is. We only set 'work_pending' to 'false' in
mptcp_pm_nl_check_work_pending() but never set it back.

I'll try to fix this later. Please add an issue for it on github.

> 
> > diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> > index d4328443d844..dee3e3c14ec3 100644
> > --- a/net/mptcp/pm_netlink.c
> > +++ b/net/mptcp/pm_netlink.c
> > @@ -248,6 +248,8 @@ bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk)
> >                 WRITE_ONCE(msk->pm.work_pending, false);
> >                 return false;
> >         }
> > +
> > +       WRITE_ONCE(msk->pm.work_pending, true);
> >         return true;
> >  }
> >  
> > @@ -911,6 +913,9 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
> >                                 WRITE_ONCE(msk->pm.accept_addr, true);
> >                 }
> >         }
> > +
> > +       if (mptcp_pm_is_kernel(msk))
> > +               mptcp_pm_nl_check_work_pending(msk);
> >  }
> >  
> >  static void mptcp_pm_nl_rm_addr_received(struct mptcp_sock *msk)
> 
> 
> So to re-enable msk->pm.work_pending when limits have been reached,
> and
> it is no longer the case. WDYT?
> 
> We would need a new subtest to confirm this patch is needed, and to
> avoid regressions later.

And add a selftest for this too.

> 
> (...)
> 
> > > > @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock *msk,
> > > >  	for (i = 0; i < rm_list->nr; i++)
> > > >  		mptcp_event_addr_removed(msk, rm_list->ids[i]);
> > > >  
> > > > +	if (!mptcp_pm_is_kernel(msk))
> > > > +		return;
> > > 
> > > This mptcp_pm_is_kernel() is indeed needed.
> > 
> > Yes it is. And 'pm->work_pending' cannot be used, because it will
> > be set
> > to false when the limits are reached, but a RM_ADDR will
> > potentially
> > decrement counters.
> 
> In fact, it is not needed: from what I see, the corresponding
> subflows
> are removed on the kernel side, the userspace daemon doesn't need to
> send a request to remove them. See the modification from 4d25247d3ae4
> ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs") in
> mptcp_pm_nl_rm_addr_or_subflow().
> 
> > I will send a v2 with only this change.
> I will send a v2 with only an early exit in
> mptcp_pm_add_addr_echoed()
> if there is no need to schedule the worker.

Sure. I agree.

Thanks,
-Geliang

> 
> Cheers,
> Matt
Matthieu Baerts Feb. 26, 2025, 3:02 p.m. UTC | #6
Hi Geliang,

On 26/02/2025 02:05, Geliang Tang wrote:
> Hi Matt,
> 
> On Tue, 2025-02-25 at 19:15 +0100, Matthieu Baerts wrote:
>> Hi Geliang,
>>
>> On 25/02/2025 10:25, Matthieu Baerts wrote:
>>> Hi Geliang,
>>>
>>> Thank you for your review!
>>>
>>> On 25/02/2025 05:21, Geliang Tang wrote:
>>>> Hi Matt,
>>>>
>>>> On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote:
>>>>> When the userspace PM is used, there is no need to schedule the
>>>>> PM
>>>>> worker for in-kernel specific tasks, e.g. creating new
>>>>> subflows, or
>>>>> sending more ADD_ADDR.
>>>>>
>>>>> Now, these tasks will be done only if the in-kernel PM is being
>>>>> used.
>>>>>
>>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>>> ---
>>>>> Notes:
>>>>>  - I don't know if this should be seen as a fix. Maybe yes
>>>>> because it
>>>>>    was not supposed to do that from the beginning, and it feels
>>>>> less
>>>>>    risky not to schedule the worker when it is not needed. I'm
>>>>> open
>>>>> to
>>>>>    suggestions here.
>>>>>  - I'm mainly sharing this patch now because Geliang is going
>>>>> to
>>>>> modify
>>>>>    these helpers to separate PM specific code.
>>>>>    @Geliang: please rebase your series on top of this one (or
>>>>> include
>>>>>    this patch if it is easier).
>>>>> Cc: Geliang Tang <tanggeliang@kylinos.cn>
>>>>> ---
>>>>>  net/mptcp/pm.c | 10 ++++++++--
>>>>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>>>>> index 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c859f5682721e3 100644
>>>>> --- a/net/mptcp/pm.c
>>>>> +++ b/net/mptcp/pm.c
>>>>> @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk)
>>>>>  	 * racing paths - accept() and check_fully_established()
>>>>>  	 * be sure to serve this event only once.
>>>>>  	 */
>>>>> -	if (READ_ONCE(pm->work_pending) &&
>>>>> +	if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) &&
>>>>
>>>> I think there's no need to modify this. work_pending is set to 0
>>>> by 
>>>> userspace pm, for all cases where work_pending is 1, the path
>>>> manager
>>>> must be an in-kernel pm.
>>>
>>> Good point, I missed that!
>>>
>>> 'pm->work_pending' doesn't look like a good name for what it does.
>>> Hopefully, this will be clearer when pm->ops will be used, which
>>> will
>>> clearly separate what is done by the different PMs.
>>>
>>> Also, is it possible we never set 'pm->work_pending' back to 'true'
>>> once
>>> it has been set to 'false'? Does it mean when we reach the limits,
>>> and
>>> something happens (e.g. a removed subflow, limits are changed,
>>> etc.),
>>> the worker is no longer used to establish new subflows or react
>>> when an
>>> ADD_ADDR is received?
>>
>> I didn't check, but it feels like we should add something like that:
> 
> Yes, it is. We only set 'work_pending' to 'false' in
> mptcp_pm_nl_check_work_pending() but never set it back.

I looked a bit at that, but I didn't find anything wrong, even if it
feels wrong not to put it back to 'true'.

> I'll try to fix this later. Please add an issue for it on github.

I can, but we will need to prove that there is an interest in setting it
back to 'true'.

Cheers,
Matt

Patch

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c859f5682721e3 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -138,7 +138,7 @@  void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk)
 	 * racing paths - accept() and check_fully_established()
 	 * be sure to serve this event only once.
 	 */
-	if (READ_ONCE(pm->work_pending) &&
+	if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) &&
 	    !(msk->pm.status & BIT(MPTCP_PM_ALREADY_ESTABLISHED)))
 		mptcp_pm_schedule_work(msk, MPTCP_PM_ESTABLISHED);
 
@@ -166,7 +166,7 @@  void mptcp_pm_subflow_established(struct mptcp_sock *msk)
 
 	pr_debug("msk=%p\n", msk);
 
-	if (!READ_ONCE(pm->work_pending))
+	if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm->work_pending))
 		return;
 
 	spin_lock_bh(&pm->lock);
@@ -251,6 +251,9 @@  void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk,
 
 	pr_debug("msk=%p\n", msk);
 
+	if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm->work_pending))
+		return;
+
 	spin_lock_bh(&pm->lock);
 
 	if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm->work_pending))
@@ -278,6 +281,9 @@  void mptcp_pm_rm_addr_received(struct mptcp_sock *msk,
 	for (i = 0; i < rm_list->nr; i++)
 		mptcp_event_addr_removed(msk, rm_list->ids[i]);
 
+	if (!mptcp_pm_is_kernel(msk))
+		return;
+
 	spin_lock_bh(&pm->lock);
 	if (mptcp_pm_schedule_work(msk, MPTCP_PM_RM_ADDR_RECEIVED))
 		pm->rm_list_rx = *rm_list;