Message ID | 20250224-mptcp-userspace-avoid-worker-v1-1-127325d3e9a4@kernel.org (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | [mptcp-next] mptcp: pm: userspace: avoid scheduling in-kernel PM worker | expand |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 34 lines checked |
matttbe/shellcheck | success | MPTCP selftests files have not been modified |
matttbe/KVM_Validation__normal | success | Success! ✅ |
matttbe/KVM_Validation__debug | success | Success! ✅ |
matttbe/KVM_Validation__btf-normal__only_bpftest_all_ | success | Success! ✅ |
matttbe/KVM_Validation__btf-debug__only_bpftest_all_ | success | Success! ✅ |
Hi Matthieu, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal: Success! ✅ - KVM Validation: debug: Success! ✅ - KVM Validation: btf-normal (only bpftest_all): Success! ✅ - KVM Validation: btf-debug (only bpftest_all): Success! ✅ - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/13505941194 Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/c3c9a58d9080 Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=937225 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-normal For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core)
Hi Matt, On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote: > When the userspace PM is used, there is no need to schedule the PM > worker for in-kernel specific tasks, e.g. creating new subflows, or > sending more ADD_ADDR. > > Now, these tasks will be done only if the in-kernel PM is being used. > > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> > --- > Notes: > - I don't know if this should be seen as a fix. Maybe yes because it > was not supposed to do that from the beginning, and it feels less > risky not to schedule the worker when it is not needed. I'm open > to > suggestions here. > - I'm mainly sharing this patch now because Geliang is going to > modify > these helpers to separate PM specific code. > @Geliang: please rebase your series on top of this one (or include > this patch if it is easier). > Cc: Geliang Tang <tanggeliang@kylinos.cn> > --- > net/mptcp/pm.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c > index > 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c > 859f5682721e3 100644 > --- a/net/mptcp/pm.c > +++ b/net/mptcp/pm.c > @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock > *msk, const struct sock *ssk) > * racing paths - accept() and check_fully_established() > * be sure to serve this event only once. > */ > - if (READ_ONCE(pm->work_pending) && > + if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) > && I think there's no need to modify this. work_pending is set to 0 by userspace pm, for all cases where work_pending is 1, the path manager must be an in-kernel pm. So there is no need to add mptcp_pm_is_kernel() check here. Just checking work_pending is sufficient. > !(msk->pm.status & BIT(MPTCP_PM_ALREADY_ESTABLISHED))) > mptcp_pm_schedule_work(msk, MPTCP_PM_ESTABLISHED); > > @@ -166,7 +166,7 @@ void mptcp_pm_subflow_established(struct > mptcp_sock *msk) > > pr_debug("msk=%p\n", msk); > > - if (!READ_ONCE(pm->work_pending)) > + if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm- > >work_pending)) > return; No need to modify this too, work_pending has been checked later in mptcp_pm_subflow_established: if (READ_ONCE(pm->work_pending)) mptcp_pm_schedule_work(msk, MPTCP_PM_SUBFLOW_ESTABLISHED); > > spin_lock_bh(&pm->lock); > @@ -251,6 +251,9 @@ void mptcp_pm_add_addr_echoed(struct mptcp_sock > *msk, > > pr_debug("msk=%p\n", msk); > > + if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm- > >work_pending)) > + return; > + Same here, work_pending has been checked later in mptcp_pm_add_addr_echoed: if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm- >work_pending)) mptcp_pm_schedule_work(msk, MPTCP_PM_SUBFLOW_ESTABLISHED); > spin_lock_bh(&pm->lock); > > if (mptcp_lookup_anno_list_by_saddr(msk, addr) && > READ_ONCE(pm->work_pending)) > @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock > *msk, > for (i = 0; i < rm_list->nr; i++) > mptcp_event_addr_removed(msk, rm_list->ids[i]); > > + if (!mptcp_pm_is_kernel(msk)) > + return; This mptcp_pm_is_kernel() is indeed needed. WDYT? Thanks, -Geliang > + > spin_lock_bh(&pm->lock); > if (mptcp_pm_schedule_work(msk, MPTCP_PM_RM_ADDR_RECEIVED)) > pm->rm_list_rx = *rm_list; > > --- > base-commit: 3716d837622ee1baf6fbe7e6a6a7a3df116e75fe > change-id: 20250224-mptcp-userspace-avoid-worker-93f367e39ca6 > > Best regards,
Hi Geliang, Thank you for your review! On 25/02/2025 05:21, Geliang Tang wrote: > Hi Matt, > > On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote: >> When the userspace PM is used, there is no need to schedule the PM >> worker for in-kernel specific tasks, e.g. creating new subflows, or >> sending more ADD_ADDR. >> >> Now, these tasks will be done only if the in-kernel PM is being used. >> >> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> >> --- >> Notes: >> - I don't know if this should be seen as a fix. Maybe yes because it >> was not supposed to do that from the beginning, and it feels less >> risky not to schedule the worker when it is not needed. I'm open >> to >> suggestions here. >> - I'm mainly sharing this patch now because Geliang is going to >> modify >> these helpers to separate PM specific code. >> @Geliang: please rebase your series on top of this one (or include >> this patch if it is easier). >> Cc: Geliang Tang <tanggeliang@kylinos.cn> >> --- >> net/mptcp/pm.c | 10 ++++++++-- >> 1 file changed, 8 insertions(+), 2 deletions(-) >> >> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c >> index >> 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c >> 859f5682721e3 100644 >> --- a/net/mptcp/pm.c >> +++ b/net/mptcp/pm.c >> @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock >> *msk, const struct sock *ssk) >> * racing paths - accept() and check_fully_established() >> * be sure to serve this event only once. >> */ >> - if (READ_ONCE(pm->work_pending) && >> + if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) >> && > > I think there's no need to modify this. work_pending is set to 0 by > userspace pm, for all cases where work_pending is 1, the path manager > must be an in-kernel pm. Good point, I missed that! 'pm->work_pending' doesn't look like a good name for what it does. Hopefully, this will be clearer when pm->ops will be used, which will clearly separate what is done by the different PMs. Also, is it possible we never set 'pm->work_pending' back to 'true' once it has been set to 'false'? Does it mean when we reach the limits, and something happens (e.g. a removed subflow, limits are changed, etc.), the worker is no longer used to establish new subflows or react when an ADD_ADDR is received? > So there is no need to add mptcp_pm_is_kernel() check here. Just > checking work_pending is sufficient. Indeed. >> !(msk->pm.status & BIT(MPTCP_PM_ALREADY_ESTABLISHED))) >> mptcp_pm_schedule_work(msk, MPTCP_PM_ESTABLISHED); >> >> @@ -166,7 +166,7 @@ void mptcp_pm_subflow_established(struct >> mptcp_sock *msk) >> >> pr_debug("msk=%p\n", msk); >> >> - if (!READ_ONCE(pm->work_pending)) >> + if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm- >>> work_pending)) >> return; > > No need to modify this too, work_pending has been checked later in > mptcp_pm_subflow_established: > > if (READ_ONCE(pm->work_pending)) > mptcp_pm_schedule_work(msk, MPTCP_PM_SUBFLOW_ESTABLISHED); Indeed. >> >> spin_lock_bh(&pm->lock); >> @@ -251,6 +251,9 @@ void mptcp_pm_add_addr_echoed(struct mptcp_sock >> *msk, >> >> pr_debug("msk=%p\n", msk); >> >> + if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm- >>> work_pending)) >> + return; >> + > > Same here, work_pending has been checked later in > mptcp_pm_add_addr_echoed: > > if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm- >> work_pending)) > mptcp_pm_schedule_work(msk, MPTCP_PM_SUBFLOW_ESTABLISHED); Indeed. >> spin_lock_bh(&pm->lock); >> >> if (mptcp_lookup_anno_list_by_saddr(msk, addr) && >> READ_ONCE(pm->work_pending)) >> @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock >> *msk, >> for (i = 0; i < rm_list->nr; i++) >> mptcp_event_addr_removed(msk, rm_list->ids[i]); >> >> + if (!mptcp_pm_is_kernel(msk)) >> + return; > > This mptcp_pm_is_kernel() is indeed needed. Yes it is. And 'pm->work_pending' cannot be used, because it will be set to false when the limits are reached, but a RM_ADDR will potentially decrement counters. I will send a v2 with only this change. Cheers, Matt
Hi Geliang, On 25/02/2025 10:25, Matthieu Baerts wrote: > Hi Geliang, > > Thank you for your review! > > On 25/02/2025 05:21, Geliang Tang wrote: >> Hi Matt, >> >> On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote: >>> When the userspace PM is used, there is no need to schedule the PM >>> worker for in-kernel specific tasks, e.g. creating new subflows, or >>> sending more ADD_ADDR. >>> >>> Now, these tasks will be done only if the in-kernel PM is being used. >>> >>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> >>> --- >>> Notes: >>> - I don't know if this should be seen as a fix. Maybe yes because it >>> was not supposed to do that from the beginning, and it feels less >>> risky not to schedule the worker when it is not needed. I'm open >>> to >>> suggestions here. >>> - I'm mainly sharing this patch now because Geliang is going to >>> modify >>> these helpers to separate PM specific code. >>> @Geliang: please rebase your series on top of this one (or include >>> this patch if it is easier). >>> Cc: Geliang Tang <tanggeliang@kylinos.cn> >>> --- >>> net/mptcp/pm.c | 10 ++++++++-- >>> 1 file changed, 8 insertions(+), 2 deletions(-) >>> >>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c >>> index >>> 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c >>> 859f5682721e3 100644 >>> --- a/net/mptcp/pm.c >>> +++ b/net/mptcp/pm.c >>> @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock >>> *msk, const struct sock *ssk) >>> * racing paths - accept() and check_fully_established() >>> * be sure to serve this event only once. >>> */ >>> - if (READ_ONCE(pm->work_pending) && >>> + if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) >>> && >> >> I think there's no need to modify this. work_pending is set to 0 by >> userspace pm, for all cases where work_pending is 1, the path manager >> must be an in-kernel pm. > > Good point, I missed that! > > 'pm->work_pending' doesn't look like a good name for what it does. > Hopefully, this will be clearer when pm->ops will be used, which will > clearly separate what is done by the different PMs. > > Also, is it possible we never set 'pm->work_pending' back to 'true' once > it has been set to 'false'? Does it mean when we reach the limits, and > something happens (e.g. a removed subflow, limits are changed, etc.), > the worker is no longer used to establish new subflows or react when an > ADD_ADDR is received? I didn't check, but it feels like we should add something like that: > diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c > index d4328443d844..dee3e3c14ec3 100644 > --- a/net/mptcp/pm_netlink.c > +++ b/net/mptcp/pm_netlink.c > @@ -248,6 +248,8 @@ bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk) > WRITE_ONCE(msk->pm.work_pending, false); > return false; > } > + > + WRITE_ONCE(msk->pm.work_pending, true); > return true; > } > > @@ -911,6 +913,9 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk, > WRITE_ONCE(msk->pm.accept_addr, true); > } > } > + > + if (mptcp_pm_is_kernel(msk)) > + mptcp_pm_nl_check_work_pending(msk); > } > > static void mptcp_pm_nl_rm_addr_received(struct mptcp_sock *msk) So to re-enable msk->pm.work_pending when limits have been reached, and it is no longer the case. WDYT? We would need a new subtest to confirm this patch is needed, and to avoid regressions later. (...) >>> @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock >>> *msk, >>> for (i = 0; i < rm_list->nr; i++) >>> mptcp_event_addr_removed(msk, rm_list->ids[i]); >>> >>> + if (!mptcp_pm_is_kernel(msk)) >>> + return; >> >> This mptcp_pm_is_kernel() is indeed needed. > > Yes it is. And 'pm->work_pending' cannot be used, because it will be set > to false when the limits are reached, but a RM_ADDR will potentially > decrement counters. In fact, it is not needed: from what I see, the corresponding subflows are removed on the kernel side, the userspace daemon doesn't need to send a request to remove them. See the modification from 4d25247d3ae4 ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs") in mptcp_pm_nl_rm_addr_or_subflow(). > I will send a v2 with only this change. I will send a v2 with only an early exit in mptcp_pm_add_addr_echoed() if there is no need to schedule the worker. Cheers, Matt
Hi Matt, On Tue, 2025-02-25 at 19:15 +0100, Matthieu Baerts wrote: > Hi Geliang, > > On 25/02/2025 10:25, Matthieu Baerts wrote: > > Hi Geliang, > > > > Thank you for your review! > > > > On 25/02/2025 05:21, Geliang Tang wrote: > > > Hi Matt, > > > > > > On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote: > > > > When the userspace PM is used, there is no need to schedule the > > > > PM > > > > worker for in-kernel specific tasks, e.g. creating new > > > > subflows, or > > > > sending more ADD_ADDR. > > > > > > > > Now, these tasks will be done only if the in-kernel PM is being > > > > used. > > > > > > > > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> > > > > --- > > > > Notes: > > > > - I don't know if this should be seen as a fix. Maybe yes > > > > because it > > > > was not supposed to do that from the beginning, and it feels > > > > less > > > > risky not to schedule the worker when it is not needed. I'm > > > > open > > > > to > > > > suggestions here. > > > > - I'm mainly sharing this patch now because Geliang is going > > > > to > > > > modify > > > > these helpers to separate PM specific code. > > > > @Geliang: please rebase your series on top of this one (or > > > > include > > > > this patch if it is easier). > > > > Cc: Geliang Tang <tanggeliang@kylinos.cn> > > > > --- > > > > net/mptcp/pm.c | 10 ++++++++-- > > > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c > > > > index > > > > 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb8332890567675976 > > > > 8fb44c > > > > 859f5682721e3 100644 > > > > --- a/net/mptcp/pm.c > > > > +++ b/net/mptcp/pm.c > > > > @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct > > > > mptcp_sock > > > > *msk, const struct sock *ssk) > > > > * racing paths - accept() and > > > > check_fully_established() > > > > * be sure to serve this event only once. > > > > */ > > > > - if (READ_ONCE(pm->work_pending) && > > > > + if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm- > > > > >work_pending) > > > > && > > > > > > I think there's no need to modify this. work_pending is set to 0 > > > by > > > userspace pm, for all cases where work_pending is 1, the path > > > manager > > > must be an in-kernel pm. > > > > Good point, I missed that! > > > > 'pm->work_pending' doesn't look like a good name for what it does. > > Hopefully, this will be clearer when pm->ops will be used, which > > will > > clearly separate what is done by the different PMs. > > > > Also, is it possible we never set 'pm->work_pending' back to 'true' > > once > > it has been set to 'false'? Does it mean when we reach the limits, > > and > > something happens (e.g. a removed subflow, limits are changed, > > etc.), > > the worker is no longer used to establish new subflows or react > > when an > > ADD_ADDR is received? > > I didn't check, but it feels like we should add something like that: Yes, it is. We only set 'work_pending' to 'false' in mptcp_pm_nl_check_work_pending() but never set it back. I'll try to fix this later. Please add an issue for it on github. > > > diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c > > index d4328443d844..dee3e3c14ec3 100644 > > --- a/net/mptcp/pm_netlink.c > > +++ b/net/mptcp/pm_netlink.c > > @@ -248,6 +248,8 @@ bool mptcp_pm_nl_check_work_pending(struct > > mptcp_sock *msk) > > WRITE_ONCE(msk->pm.work_pending, false); > > return false; > > } > > + > > + WRITE_ONCE(msk->pm.work_pending, true); > > return true; > > } > > > > @@ -911,6 +913,9 @@ static void > > mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk, > > WRITE_ONCE(msk->pm.accept_addr, > > true); > > } > > } > > + > > + if (mptcp_pm_is_kernel(msk)) > > + mptcp_pm_nl_check_work_pending(msk); > > } > > > > static void mptcp_pm_nl_rm_addr_received(struct mptcp_sock *msk) > > > So to re-enable msk->pm.work_pending when limits have been reached, > and > it is no longer the case. WDYT? > > We would need a new subtest to confirm this patch is needed, and to > avoid regressions later. And add a selftest for this too. > > (...) > > > > > @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct > > > > mptcp_sock > > > > *msk, > > > > for (i = 0; i < rm_list->nr; i++) > > > > mptcp_event_addr_removed(msk, rm_list- > > > > >ids[i]); > > > > > > > > + if (!mptcp_pm_is_kernel(msk)) > > > > + return; > > > > > > This mptcp_pm_is_kernel() is indeed needed. > > > > Yes it is. And 'pm->work_pending' cannot be used, because it will > > be set > > to false when the limits are reached, but a RM_ADDR will > > potentially > > decrement counters. > > In fact, it is not needed: from what I see, the corresponding > subflows > are removed on the kernel side, the userspace daemon doesn't need to > send a request to remove them. See the modification from 4d25247d3ae4 > ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs") in > mptcp_pm_nl_rm_addr_or_subflow(). > > > I will send a v2 with only this change. > I will send a v2 with only an early exit in > mptcp_pm_add_addr_echoed() > if there is no need to schedule the worker. Sure. I agree. Thanks, -Geliang > > Cheers, > Matt
Hi Geliang, On 26/02/2025 02:05, Geliang Tang wrote: > Hi Matt, > > On Tue, 2025-02-25 at 19:15 +0100, Matthieu Baerts wrote: >> Hi Geliang, >> >> On 25/02/2025 10:25, Matthieu Baerts wrote: >>> Hi Geliang, >>> >>> Thank you for your review! >>> >>> On 25/02/2025 05:21, Geliang Tang wrote: >>>> Hi Matt, >>>> >>>> On Mon, 2025-02-24 at 20:01 +0100, Matthieu Baerts (NGI0) wrote: >>>>> When the userspace PM is used, there is no need to schedule the >>>>> PM >>>>> worker for in-kernel specific tasks, e.g. creating new >>>>> subflows, or >>>>> sending more ADD_ADDR. >>>>> >>>>> Now, these tasks will be done only if the in-kernel PM is being >>>>> used. >>>>> >>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> >>>>> --- >>>>> Notes: >>>>> - I don't know if this should be seen as a fix. Maybe yes >>>>> because it >>>>> was not supposed to do that from the beginning, and it feels >>>>> less >>>>> risky not to schedule the worker when it is not needed. I'm >>>>> open >>>>> to >>>>> suggestions here. >>>>> - I'm mainly sharing this patch now because Geliang is going >>>>> to >>>>> modify >>>>> these helpers to separate PM specific code. >>>>> @Geliang: please rebase your series on top of this one (or >>>>> include >>>>> this patch if it is easier). >>>>> Cc: Geliang Tang <tanggeliang@kylinos.cn> >>>>> --- >>>>> net/mptcp/pm.c | 10 ++++++++-- >>>>> 1 file changed, 8 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c >>>>> index >>>>> 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb8332890567675976 >>>>> 8fb44c >>>>> 859f5682721e3 100644 >>>>> --- a/net/mptcp/pm.c >>>>> +++ b/net/mptcp/pm.c >>>>> @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct >>>>> mptcp_sock >>>>> *msk, const struct sock *ssk) >>>>> * racing paths - accept() and >>>>> check_fully_established() >>>>> * be sure to serve this event only once. >>>>> */ >>>>> - if (READ_ONCE(pm->work_pending) && >>>>> + if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm- >>>>>> work_pending) >>>>> && >>>> >>>> I think there's no need to modify this. work_pending is set to 0 >>>> by >>>> userspace pm, for all cases where work_pending is 1, the path >>>> manager >>>> must be an in-kernel pm. >>> >>> Good point, I missed that! >>> >>> 'pm->work_pending' doesn't look like a good name for what it does. >>> Hopefully, this will be clearer when pm->ops will be used, which >>> will >>> clearly separate what is done by the different PMs. >>> >>> Also, is it possible we never set 'pm->work_pending' back to 'true' >>> once >>> it has been set to 'false'? Does it mean when we reach the limits, >>> and >>> something happens (e.g. a removed subflow, limits are changed, >>> etc.), >>> the worker is no longer used to establish new subflows or react >>> when an >>> ADD_ADDR is received? >> >> I didn't check, but it feels like we should add something like that: > > Yes, it is. We only set 'work_pending' to 'false' in > mptcp_pm_nl_check_work_pending() but never set it back. I looked a bit at that, but I didn't find anything wrong, even if it feels wrong not to put it back to 'true'. > I'll try to fix this later. Please add an issue for it on github. I can, but we will need to proof that there is an interest to set it back to "true". Cheers, Matt
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 16cacce6c10fe86467aa7ef8e588f9f535b586fb..4eabb83328905676759768fb44c859f5682721e3 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -138,7 +138,7 @@ void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk) * racing paths - accept() and check_fully_established() * be sure to serve this event only once. */ - if (READ_ONCE(pm->work_pending) && + if (mptcp_pm_is_kernel(msk) && READ_ONCE(pm->work_pending) && !(msk->pm.status & BIT(MPTCP_PM_ALREADY_ESTABLISHED))) mptcp_pm_schedule_work(msk, MPTCP_PM_ESTABLISHED); @@ -166,7 +166,7 @@ void mptcp_pm_subflow_established(struct mptcp_sock *msk) pr_debug("msk=%p\n", msk); - if (!READ_ONCE(pm->work_pending)) + if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm->work_pending)) return; spin_lock_bh(&pm->lock); @@ -251,6 +251,9 @@ void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk, pr_debug("msk=%p\n", msk); + if (!mptcp_pm_is_kernel(msk) || !READ_ONCE(pm->work_pending)) + return; + spin_lock_bh(&pm->lock); if (mptcp_lookup_anno_list_by_saddr(msk, addr) && READ_ONCE(pm->work_pending)) @@ -278,6 +281,9 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock *msk, for (i = 0; i < rm_list->nr; i++) mptcp_event_addr_removed(msk, rm_list->ids[i]); + if (!mptcp_pm_is_kernel(msk)) + return; + spin_lock_bh(&pm->lock); if (mptcp_pm_schedule_work(msk, MPTCP_PM_RM_ADDR_RECEIVED)) pm->rm_list_rx = *rm_list;
When the userspace PM is used, there is no need to schedule the PM worker for in-kernel specific tasks, e.g. creating new subflows, or sending more ADD_ADDR. Now, these tasks will be done only if the in-kernel PM is being used. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- Notes: - I don't know if this should be seen as a fix. Maybe yes because it was not supposed to do that from the beginning, and it feels less risky not to schedule the worker when it is not needed. I'm open to suggestions here. - I'm mainly sharing this patch now because Geliang is going to modify these helpers to separate PM specific code. @Geliang: please rebase your series on top of this one (or include this patch if it is easier). Cc: Geliang Tang <tanggeliang@kylinos.cn> --- net/mptcp/pm.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) --- base-commit: 3716d837622ee1baf6fbe7e6a6a7a3df116e75fe change-id: 20250224-mptcp-userspace-avoid-worker-93f367e39ca6 Best regards,