Message ID | 20220907111132.31722-1-imagedong@tencent.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v3] net: mptcp: fix unreleased socket in accept queue | expand |
Hi Menglong, On 07/09/2022 13:11, menglong8.dong@gmail.com wrote: > From: Menglong Dong <imagedong@tencent.com> > > The mptcp socket and its subflow sockets in accept queue can't be > released after the process exit. > > While the release of a mptcp socket in listening state, the > corresponding tcp socket will be released too. Meanwhile, the tcp > socket in the unaccept queue will be released too. However, only init > subflow is in the unaccept queue, and the joined subflow is not in the > unaccept queue, which makes the joined subflow won't be released, and > therefore the corresponding unaccepted mptcp socket will not be released > to. Thank you for the v3. Unfortunately, our CI found a possible recursive locking: > - KVM Validation: debug: > - Unstable: 1 failed test(s): selftest_mptcp_join - Critical: 1 Call Trace(s) ❌: > - Task: https://cirrus-ci.com/task/5418283233968128 > - Summary: https://api.cirrus-ci.com/v1/artifact/task/5418283233968128/summary/summary.txt https://lore.kernel.org/mptcp/4e6d3d9e-1f1a-23ae-cb56-2d4f043f17ae@gmail.com/T/#u Do you mind looking at it please? Also, because it is not just a simple fix, may you send any new versions only to MPTCP mailing list please? So without the other mailing lists and netdev maintainers to reduce the audience during the development. Once the patch is ready, we will apply it in MPTCP tree and send it to netdev. That's what we usually for MPTCP related patches. Cheers, Matt
On Thu, 2022-09-08 at 15:56 +0200, Matthieu Baerts wrote: > Hi Menglong, > > On 07/09/2022 13:11, menglong8.dong@gmail.com wrote: > > From: Menglong Dong <imagedong@tencent.com> > > > > The mptcp socket and its subflow sockets in accept queue can't be > > released after the process exit. > > > > While the release of a mptcp socket in listening state, the > > corresponding tcp socket will be released too. Meanwhile, the tcp > > socket in the unaccept queue will be released too. However, only init > > subflow is in the unaccept queue, and the joined subflow is not in the > > unaccept queue, which makes the joined subflow won't be released, and > > therefore the corresponding unaccepted mptcp socket will not be released > > to. > > Thank you for the v3. > > Unfortunately, our CI found a possible recursive locking: > > > - KVM Validation: debug: > > - Unstable: 1 failed test(s): selftest_mptcp_join - Critical: 1 Call Trace(s) ❌: > > - Task: https://cirrus-ci.com/task/5418283233968128 > > - Summary: https://api.cirrus-ci.com/v1/artifact/task/5418283233968128/summary/summary.txt > > https://lore.kernel.org/mptcp/4e6d3d9e-1f1a-23ae-cb56-2d4f043f17ae@gmail.com/T/#u > > Do you mind looking at it please? Ah, that is actually a false positive, but we must silence it. The main point is that the lock_sock() in mptcp_close() rightfully lacks the _nested annotation. Instead of adding such annotation only for this call site, which would be both ugly and dangerous, I suggest to factor_out from mptcp_close() all the code the run under the socket lock, say in: bool __mptcp_close(struct sock *sk, long timeout) // return true if the caller need to cancel the mptcp worker // (outside the socket lock) and then in mptcp_subflow_queue_clean(): sock_hold(sk); slow = lock_sock_fast_nested(sk); next = msk->dl_next; msk->first = NULL; msk->dl_next = NULL; do_cancel_work = __mptcp_close(sk, 0); unlock_sock_fast(sk, slow); if (do_cancel_work) mptcp_cancel_work(sk); sock_put(sk); All the above could require 2 different patches, 1 to factor-out the helper, and 1 to actually implement the fix. Cheers, Paolo
On Thu, Sep 8, 2022 at 10:45 PM Paolo Abeni <pabeni@redhat.com> wrote: > > On Thu, 2022-09-08 at 15:56 +0200, Matthieu Baerts wrote: > > Hi Menglong, > > > > On 07/09/2022 13:11, menglong8.dong@gmail.com wrote: > > > From: Menglong Dong <imagedong@tencent.com> > > > > > > The mptcp socket and its subflow sockets in accept queue can't be > > > released after the process exit. > > > > > > While the release of a mptcp socket in listening state, the > > > corresponding tcp socket will be released too. Meanwhile, the tcp > > > socket in the unaccept queue will be released too. However, only init > > > subflow is in the unaccept queue, and the joined subflow is not in the > > > unaccept queue, which makes the joined subflow won't be released, and > > > therefore the corresponding unaccepted mptcp socket will not be released > > > to. > > > > Thank you for the v3. > > > > Unfortunately, our CI found a possible recursive locking: > > > > > - KVM Validation: debug: > > > - Unstable: 1 failed test(s): selftest_mptcp_join - Critical: 1 Call Trace(s) ❌: > > > - Task: https://cirrus-ci.com/task/5418283233968128 > > > - Summary: https://api.cirrus-ci.com/v1/artifact/task/5418283233968128/summary/summary.txt > > > > https://lore.kernel.org/mptcp/4e6d3d9e-1f1a-23ae-cb56-2d4f043f17ae@gmail.com/T/#u > > > > Do you mind looking at it please? > > Ah, that is actually a false positive, but we must silence it. The main > point is that the lock_sock() in mptcp_close() rightfully lacks the > _nested annotation. > > Instead of adding such annotation only for this call site, which would > be both ugly and dangerous, I suggest to factor_out from mptcp_close() > all the code the run under the socket lock, say in: > > bool __mptcp_close(struct sock *sk, long timeout) > // return true if the caller need to cancel the mptcp worker > // (outside the socket lock) > > and then in mptcp_subflow_queue_clean(): > > sock_hold(sk); > > slow = lock_sock_fast_nested(sk); > next = msk->dl_next; > msk->first = NULL; > msk->dl_next = NULL; > do_cancel_work = __mptcp_close(sk, 0); > unlock_sock_fast(sk, slow); > > if (do_cancel_work) > mptcp_cancel_work(sk); > sock_put(sk); > > All the above could require 2 different patches, 1 to factor-out the > helper, and 1 to actually implement the fix. > Thanks for your explanation! As Matthieu said, I'll send the next version to the MPTCP mailing list only. Thanks! Menglong Dong > Cheers, > > Paolo >
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index d398f3810662..2de33626b73f 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2796,7 +2796,7 @@ static void __mptcp_destroy_sock(struct sock *sk) sock_put(sk); } -static void mptcp_close(struct sock *sk, long timeout) +void mptcp_close(struct sock *sk, long timeout) { struct mptcp_subflow_context *subflow; struct mptcp_sock *msk = mptcp_sk(sk); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 132d50833df1..102692b581dd 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -612,6 +612,7 @@ void mptcp_subflow_reset(struct sock *ssk); void mptcp_subflow_queue_clean(struct sock *ssk); void mptcp_sock_graft(struct sock *sk, struct socket *parent); struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk); +void mptcp_close(struct sock *sk, long timeout); bool mptcp_addresses_equal(const struct mptcp_addr_info *a, const struct mptcp_addr_info *b, bool use_port); diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index c7d49fb6e7bd..45315a57185a 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -602,30 +602,6 @@ static bool subflow_hmac_valid(const struct request_sock *req, return !crypto_memneq(hmac, mp_opt->hmac, MPTCPOPT_HMAC_LEN); } -static void mptcp_sock_destruct(struct sock *sk) -{ - /* if new mptcp socket isn't accepted, it is free'd - * from the tcp listener sockets request queue, linked - * from req->sk. The tcp socket is released. - * This calls the ULP release function which will - * also remove the mptcp socket, via - * sock_put(ctx->conn). - * - * Problem is that the mptcp socket will be in - * ESTABLISHED state and will not have the SOCK_DEAD flag. - * Both result in warnings from inet_sock_destruct. - */ - if ((1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { - sk->sk_state = TCP_CLOSE; - WARN_ON_ONCE(sk->sk_socket); - sock_orphan(sk); - } - - /* We don't need to clear msk->subflow, as it's still NULL at this point */ - mptcp_destroy_common(mptcp_sk(sk), 0); - inet_sock_destruct(sk); -} - static void mptcp_force_close(struct sock *sk) { /* the msk is not yet exposed to user-space */ @@ -768,7 +744,6 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk, /* new mpc subflow takes ownership of the newly * created mptcp socket */ - new_msk->sk_destruct = mptcp_sock_destruct; mptcp_sk(new_msk)->setsockopt_seq = ctx->setsockopt_seq; mptcp_pm_new_connection(mptcp_sk(new_msk), child, 1); mptcp_token_accept(subflow_req, mptcp_sk(new_msk)); @@ -1770,6 +1745,12 @@ void mptcp_subflow_queue_clean(struct sock *listener_ssk) msk->first = NULL; msk->dl_next = NULL; unlock_sock_fast(sk, slow); + + /* mptcp_close() will put a extra reference on sk, + * so we hold one here. + */ + sock_hold(sk); + mptcp_close(sk, 0); } /* we are still under the listener msk socket lock */