Message ID | f6371f65b1ebabc0f308528494ec39c9bdf786fd.1702631517.git.pabeni@redhat.com (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | 5913b5d75d1f8a286057f25abe2a51ff84109178 |
Delegated to: | Matthieu Baerts |
Headers | show |
Series | [mptcp-net] mptcp: prevent tcp diag from closing listener subflows | expand |
Context | Check | Description |
---|---|---|
matttbe/KVM_Validation__normal__except_selftest_mptcp_join_ | success | Success! ✅ |
matttbe/KVM_Validation__normal__only_selftest_mptcp_join_ | success | Success! ✅ |
matttbe/KVM_Validation__debug__except_selftest_mptcp_join_ | success | Success! ✅ |
matttbe/KVM_Validation__debug__only_selftest_mptcp_join_ | warning | Unstable: 1 failed test(s): selftest_mptcp_join |
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 31 lines checked |
Hi Paolo, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal (except selftest_mptcp_join): - Success! ✅: - Task: https://cirrus-ci.com/task/6646535530741760 - Summary: https://api.cirrus-ci.com/v1/artifact/task/6646535530741760/summary/summary.txt - KVM Validation: normal (only selftest_mptcp_join): - Success! ✅: - Task: https://cirrus-ci.com/task/5666210654715904 - Summary: https://api.cirrus-ci.com/v1/artifact/task/5666210654715904/summary/summary.txt - KVM Validation: debug (except selftest_mptcp_join): - Success! ✅: - Task: https://cirrus-ci.com/task/5731741856432128 - Summary: https://api.cirrus-ci.com/v1/artifact/task/5731741856432128/summary/summary.txt - KVM Validation: debug (only selftest_mptcp_join): - Unstable: 1 failed test(s): selftest_mptcp_join
On Fri, 15 Dec 2023, Paolo Abeni wrote: > The MPTCP protocol does not expect that any other entity could change > the first subflow status when such socket is listening. > Unfortunately the TCP diag interface allows aborting any TCP socket, > including MPTCP listeners subflows. As reported by syzbot, that trigger > a WARN() and could lead to later bigger trouble. > > The MPTCP protocol needs to do some MPTCP-level cleanup actions to > properly shutdown the listener. To keep the fix simple, prevent > entirely the diag interface from stopping such listeners. > > We could refine the diag callback in a later, larger patch targeting > net-next. > > Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status") > Closes: https://lore.kernel.org/netdev/0000000000004f4579060c68431b@google.com/ > Reported-by: syzbot+5a01c3a666e726bc8752@syzkaller.appspotmail.com > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > --- > net/mptcp/subflow.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c > index 64302a1ac306..1f98bec264a6 100644 > --- a/net/mptcp/subflow.c > +++ b/net/mptcp/subflow.c > @@ -1981,6 +1981,17 @@ static void tcp_release_cb_override(struct sock *ssk) > tcp_release_cb(ssk); > } > > +static int tcp_abort_override(struct sock *ssk, int err) > +{ > + /* closing a listener subflow requires a great deal of care. > + * keep it simple and just prevent such operation > + */ > + if (inet_sk_state_load(ssk) == TCP_LISTEN) > + return -EINVAL; > + > + return tcp_abort(ssk, err); Hi Paolo - I looked around the code to see if there might be issues with an unexpected tcp_abort on non-listening subflow sockets. Closest code I found was mptcp_subflow_reset(), and the main difference seems to be giving the msk a chance to clean up: if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &mptcp_sk(sk)->flags)) mptcp_schedule_work(sk); Worthwhile to add that in this override function? - Mat > +} > + > static struct tcp_ulp_ops subflow_ulp_ops __read_mostly = { > .name = "mptcp", > .owner = THIS_MODULE, > @@ -2025,6 +2036,7 @@ void __init mptcp_subflow_init(void) > > tcp_prot_override = tcp_prot; > tcp_prot_override.release_cb = tcp_release_cb_override; > + tcp_prot_override.diag_destroy = tcp_abort_override; > > #if IS_ENABLED(CONFIG_MPTCP_IPV6) > /* In struct mptcp_subflow_request_sock, we assume the TCP request sock > @@ -2060,6 +2072,7 @@ void __init mptcp_subflow_init(void) > > tcpv6_prot_override = tcpv6_prot; > tcpv6_prot_override.release_cb = tcp_release_cb_override; > + tcpv6_prot_override.diag_destroy = tcp_abort_override; > #endif > > mptcp_diag_subflow_init(&subflow_ulp_ops); > -- > 2.42.0 > > >
On Fri, 2023-12-15 at 16:47 -0800, Mat Martineau wrote: > On Fri, 15 Dec 2023, Paolo Abeni wrote: > > > The MPTCP protocol does not expect that any other entity could change > > the first subflow status when such socket is listening. > > Unfortunately the TCP diag interface allows aborting any TCP socket, > > including MPTCP listeners subflows. As reported by syzbot, that trigger > > a WARN() and could lead to later bigger trouble. > > > > The MPTCP protocol needs to do some MPTCP-level cleanup actions to > > properly shutdown the listener. To keep the fix simple, prevent > > entirely the diag interface from stopping such listeners. > > > > We could refine the diag callback in a later, larger patch targeting > > net-next. > > > > Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status") > > Closes: https://lore.kernel.org/netdev/0000000000004f4579060c68431b@google.com/ > > Reported-by: syzbot+5a01c3a666e726bc8752@syzkaller.appspotmail.com > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > --- > > net/mptcp/subflow.c | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > > > diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c > > index 64302a1ac306..1f98bec264a6 100644 > > --- a/net/mptcp/subflow.c > > +++ b/net/mptcp/subflow.c > > @@ -1981,6 +1981,17 @@ static void tcp_release_cb_override(struct sock *ssk) > > tcp_release_cb(ssk); > > } > > > > +static int tcp_abort_override(struct sock *ssk, int err) > > +{ > > + /* closing a listener subflow requires a great deal of care. > > + * keep it simple and just prevent such operation > > + */ > > + if (inet_sk_state_load(ssk) == TCP_LISTEN) > > + return -EINVAL; > > + > > + return tcp_abort(ssk, err); > > Hi Paolo - > > I looked around the code to see if there might be issues with an > unexpected tcp_abort on non-listening subflow sockets. Closest code I > found was mptcp_subflow_reset(), and the main difference seems to be > giving the msk a chance to clean up: > > if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &mptcp_sk(sk)->flags)) > mptcp_schedule_work(sk); > > Worthwhile to add that in this override function? Reporting there the discussion from the last mtg for completeness. Async closing of mptcp listener via the worker has proven to be quite complex and bug prone. In this case it would probably be safe, but given the troubled git history of mptcp_check_listen_stop(), I would keep this version for net, and do the complete thing for net-next, with no rush. Cheers, Paolo
On Thu, 21 Dec 2023, Paolo Abeni wrote: > On Fri, 2023-12-15 at 16:47 -0800, Mat Martineau wrote: >> On Fri, 15 Dec 2023, Paolo Abeni wrote: >> >>> The MPTCP protocol does not expect that any other entity could change >>> the first subflow status when such socket is listening. >>> Unfortunately the TCP diag interface allows aborting any TCP socket, >>> including MPTCP listeners subflows. As reported by syzbot, that trigger >>> a WARN() and could lead to later bigger trouble. >>> >>> The MPTCP protocol needs to do some MPTCP-level cleanup actions to >>> properly shutdown the listener. To keep the fix simple, prevent >>> entirely the diag interface from stopping such listeners. >>> >>> We could refine the diag callback in a later, larger patch targeting >>> net-next. >>> >>> Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status") >>> Closes: https://lore.kernel.org/netdev/0000000000004f4579060c68431b@google.com/ >>> Reported-by: syzbot+5a01c3a666e726bc8752@syzkaller.appspotmail.com >>> Signed-off-by: Paolo Abeni <pabeni@redhat.com> >>> --- >>> net/mptcp/subflow.c | 13 +++++++++++++ >>> 1 file changed, 13 insertions(+) >>> >>> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c >>> index 64302a1ac306..1f98bec264a6 100644 >>> --- a/net/mptcp/subflow.c >>> +++ b/net/mptcp/subflow.c >>> @@ -1981,6 +1981,17 @@ static void tcp_release_cb_override(struct sock *ssk) >>> tcp_release_cb(ssk); >>> } >>> >>> +static int tcp_abort_override(struct sock *ssk, int err) >>> +{ >>> + /* closing a listener subflow requires a great deal of care. >>> + * keep it simple and just prevent such operation >>> + */ >>> + if (inet_sk_state_load(ssk) == TCP_LISTEN) >>> + return -EINVAL; >>> + >>> + return tcp_abort(ssk, err); >> >> Hi Paolo - >> >> I looked around the code to see if there might be issues with an >> unexpected tcp_abort on non-listening subflow sockets. Closest code I >> found was mptcp_subflow_reset(), and the main difference seems to be >> giving the msk a chance to clean up: >> >> if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &mptcp_sk(sk)->flags)) >> mptcp_schedule_work(sk); >> >> Worthwhile to add that in this override function? > > Reporting there the discussion from the last mtg for completeness. > Async closing of mptcp listener via the worker has proven to be quite > complex and bug prone. In this case it would probably be safe, but > given the troubled git history of mptcp_check_listen_stop(), I would > keep this version for net, and do the complete thing for net-next, with > no rush. > Thanks Paolo, sounds like a good plan. Reviewed-by: Mat Martineau <martineau@kernel.org>
Hi Paolo, Mat, On 15/12/2023 10:12, Paolo Abeni wrote: > The MPTCP protocol does not expect that any other entity could change > the first subflow status when such socket is listening. > Unfortunately the TCP diag interface allows aborting any TCP socket, > including MPTCP listeners subflows. As reported by syzbot, that trigger > a WARN() and could lead to later bigger trouble. > > The MPTCP protocol needs to do some MPTCP-level cleanup actions to > properly shutdown the listener. To keep the fix simple, prevent > entirely the diag interface from stopping such listeners. > > We could refine the diag callback in a later, larger patch targeting > net-next. Thank you for this patch and the review! > Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status") > Closes: https://lore.kernel.org/netdev/0000000000004f4579060c68431b@google.com/ > Reported-by: syzbot+5a01c3a666e726bc8752@syzkaller.appspotmail.com (I just inverted these two lines to make checkpatch happy ;) ) (and it looks like b4 automatically adds '<>' around the email address) Now in our tree (fixes for -net): New patches for t/upstream-net and t/upstream: - 5913b5d75d1f: mptcp: prevent tcp diag from closing listener subflows - Results: 9293d696010b..b7f484870b63 (export-net) - Results: 4720030d95d3..b52e904674ac (export) (Tests are not in progress) Cheers, Matt
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 64302a1ac306..1f98bec264a6 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1981,6 +1981,17 @@ static void tcp_release_cb_override(struct sock *ssk) tcp_release_cb(ssk); } +static int tcp_abort_override(struct sock *ssk, int err) +{ + /* closing a listener subflow requires a great deal of care. + * keep it simple and just prevent such operation + */ + if (inet_sk_state_load(ssk) == TCP_LISTEN) + return -EINVAL; + + return tcp_abort(ssk, err); +} + static struct tcp_ulp_ops subflow_ulp_ops __read_mostly = { .name = "mptcp", .owner = THIS_MODULE, @@ -2025,6 +2036,7 @@ void __init mptcp_subflow_init(void) tcp_prot_override = tcp_prot; tcp_prot_override.release_cb = tcp_release_cb_override; + tcp_prot_override.diag_destroy = tcp_abort_override; #if IS_ENABLED(CONFIG_MPTCP_IPV6) /* In struct mptcp_subflow_request_sock, we assume the TCP request sock @@ -2060,6 +2072,7 @@ void __init mptcp_subflow_init(void) tcpv6_prot_override = tcpv6_prot; tcpv6_prot_override.release_cb = tcp_release_cb_override; + tcpv6_prot_override.diag_destroy = tcp_abort_override; #endif mptcp_diag_subflow_init(&subflow_ulp_ops);
The MPTCP protocol does not expect that any other entity could change the first subflow status when such socket is listening. Unfortunately the TCP diag interface allows aborting any TCP socket, including MPTCP listeners subflows. As reported by syzbot, that trigger a WARN() and could lead to later bigger trouble. The MPTCP protocol needs to do some MPTCP-level cleanup actions to properly shutdown the listener. To keep the fix simple, prevent entirely the diag interface from stopping such listeners. We could refine the diag callback in a later, larger patch targeting net-next. Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status") Closes: https://lore.kernel.org/netdev/0000000000004f4579060c68431b@google.com/ Reported-by: syzbot+5a01c3a666e726bc8752@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- net/mptcp/subflow.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)