Message ID | bd4d533b-15d2-6c0a-7667-70fd95dbea20@ya.ru (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [net,v2] unix: Fix race in SOCK_SEQPACKET's unix_dgram_sendmsg() |
From: Kirill Tkhai <tkhai@ya.ru>
Date: Sun, 27 Nov 2022 01:46:51 +0300

> [...]
> This patch fixes both of these races.
>
> Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
> Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> Signed-off-by: Kirill Tkhai <tkhai@ya.ru>

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>

Thank you, Kirill.

> ---
> v2: Disconnect from peer right there.
> [...]

On Sun, 2022-11-27 at 01:46 +0300, Kirill Tkhai wrote:
> [...]
> This patch fixes both of these races.
>
> Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")

I don't think this commit introduces the issues; both behaviors
described above appear to have been present even before?

Thanks!

Paolo

On 01.12.2022 12:30, Paolo Abeni wrote:
> On Sun, 2022-11-27 at 01:46 +0300, Kirill Tkhai wrote:
>> [...]
>> Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
>
> I don't think this commit introduces the issues; both behaviors
> described above appear to have been present even before?

1) Hm, I pointed to the commit suggested by Kuniyuki without checking it.

Possibly the real problem commit is dc56ad7028c5 "af_unix: fix potential NULL deref in unix_dgram_connect()",
since it added the TCP_CLOSE assignment to unix_dgram_sendmsg().

2) What do you think about the initial version of the fix?

https://patchwork.kernel.org/project/netdevbpf/patch/38a920a7-cfba-7929-886d-c3c6effc0c43@ya.ru/

Although there are some arguments, I'm still not sure that v2 is better.

Thanks,
Kirill

> On Dec 3, 2022, at 7:44, Kirill Tkhai <tkhai@ya.ru> wrote:
> [...]
> 1) Hm, I pointed to the commit suggested by Kuniyuki without checking it.
>
> Possibly the real problem commit is dc56ad7028c5 "af_unix: fix potential NULL deref in unix_dgram_connect()",
> since it added the TCP_CLOSE assignment to unix_dgram_sendmsg().

The commit just moved the assignment.

Note that unix_dgram_disconnected() is called for SOCK_SEQPACKET
after releasing the lock, and 83301b5367a9 introduced the
TCP_CLOSE assignment.

> 2) What do you think about the initial version of the fix?
>
> https://patchwork.kernel.org/project/netdevbpf/patch/38a920a7-cfba-7929-886d-c3c6effc0c43@ya.ru/
>
> Although there are some arguments, I'm still not sure that v2 is better.
>
> Thanks,
> Kirill

On 03.12.2022 01:43, Kirill Tkhai wrote:
> [...]
> 2) What do you think about the initial version of the fix?
>
> https://patchwork.kernel.org/project/netdevbpf/patch/38a920a7-cfba-7929-886d-c3c6effc0c43@ya.ru/
>
> Although there are some arguments, I'm still not sure that v2 is better.

Thinking about it again, I believe v1 is better, and we don't have to
introduce optimizations that work only in very rare race cases.

So I'm going to return to v1.

Kirill

On Fri, 2022-12-02 at 23:18 +0000, Iwashima, Kuniyuki wrote:
> > On Dec 3, 2022, at 7:44, Kirill Tkhai <tkhai@ya.ru> wrote:
> > [...]
> > Possibly the real problem commit is dc56ad7028c5 "af_unix: fix potential NULL deref in unix_dgram_connect()",
> > since it added the TCP_CLOSE assignment to unix_dgram_sendmsg().
>
> The commit just moved the assignment.
>
> Note that unix_dgram_disconnected() is called for SOCK_SEQPACKET
> after releasing the lock, and 83301b5367a9 introduced the
> TCP_CLOSE assignment.

I'm sorry for the back and forth, I think I initially misread the code.
I agree 83301b5367a9 is a good Fixes tag.

> > 2) What do you think about the initial version of the fix?
> >
> > https://patchwork.kernel.org/project/netdevbpf/patch/38a920a7-cfba-7929-886d-c3c6effc0c43@ya.ru/
> >
> > Although there are some arguments, I'm still not sure that v2 is better.

v1 introduces quite a few behavior changes (different error code,
different cleanup scheme) that could IMHO be riskier for a stable
patch. I suggest picking the minimal change that addresses the issue
(v2 in this case).

Thanks,

Paolo

On 05.12.2022 12:22, Paolo Abeni wrote:
> [...]
> v1 introduces quite a few behavior changes (different error code,
> different cleanup scheme) that could IMHO be riskier for a stable
> patch. I suggest picking the minimal change that addresses the issue
> (v2 in this case).

Hm, not exactly. EPIPE is the regular return value, normally returned from
unix_dgram_sendmsg()->sock_alloc_send_pskb() (see the SEND_SHUTDOWN check).
ECONNREFUSED is the return value of the race case; it is not returned normally.

What different cleanup scheme do you mean? IMO, the behavior is the same as
what we get when the race does not trigger and sock_alloc_send_pskb() returns
EPIPE, as in the regular case.
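
The "regular case" described above can be observed from userspace. The following is a minimal sketch, assuming a Linux host; the helper name errno_after_peer_close() is made up for this illustration and is not part of the patch or the kernel. With no concurrent sender, closing one end of a SOCK_SEQPACKET pair is expected to make a later send() on the other end fail with EPIPE through the SEND_SHUTDOWN check.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the errno reported by send() on a
 * SOCK_SEQPACKET socket whose peer has already been closed, 0 if the
 * send unexpectedly succeeds, or -1 if no socket pair could be created.
 */
static int errno_after_peer_close(void)
{
	int fds[2];
	int saved;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, fds) < 0)
		return -1;

	/* The peer goes away while nothing is racing against it. */
	close(fds[1]);

	/* MSG_NOSIGNAL: report the failure via errno instead of SIGPIPE. */
	saved = send(fds[0], "x", 1, MSG_NOSIGNAL) < 0 ? errno : 0;

	close(fds[0]);
	return saved;
}
```

Calling errno_after_peer_close() from a test program and comparing the result with EPIPE gives the non-racy baseline; the ECONNREFUSED path needs the narrow window discussed in this thread and is not reproduced by this sketch.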

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index b3545fc68097..be40023a61fb 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2001,11 +2001,14 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
 		err = 0;
 		if (unix_peer(sk) == other) {
 			unix_peer(sk) = NULL;
-			unix_dgram_peer_wake_disconnect_wakeup(sk, other);
+
+			if (sk->sk_type == SOCK_DGRAM) {
+				unix_dgram_peer_wake_disconnect_wakeup(sk, other);
+				sk->sk_state = TCP_CLOSE;
+			}
 
 			unix_state_unlock(sk);
 
-			sk->sk_state = TCP_CLOSE;
 			unix_dgram_disconnected(sk, other);
 			sock_put(other);
 			err = -ECONNREFUSED;
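
The effect of this hunk on the two socket types can be modeled in a few lines of userspace C. This is a toy sketch, not kernel code: struct toy_sock, the toy_* enums and all functions are hypothetical stand-ins for sk->sk_type, sk->sk_state and unix_peer(sk), and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_type  { TOY_DGRAM, TOY_SEQPACKET };
enum toy_state { TOY_CLOSE, TOY_ESTABLISHED };

struct toy_sock {
	enum toy_type  type;     /* stands in for sk->sk_type */
	enum toy_state state;    /* stands in for sk->sk_state */
	bool           has_peer; /* stands in for unix_peer(sk) != NULL */
};

/* Dead-peer path before the patch: every socket type is moved to CLOSE. */
static void toy_dead_peer_old(struct toy_sock *sk)
{
	sk->has_peer = false;
	sk->state = TOY_CLOSE;
}

/* Dead-peer path with the v2 hunk: only the datagram type changes state. */
static void toy_dead_peer_v2(struct toy_sock *sk)
{
	sk->has_peer = false;
	if (sk->type == TOY_DGRAM)
		sk->state = TOY_CLOSE;
}

/* Run one variant on a fresh connected socket and return its final state. */
static enum toy_state toy_final_state(enum toy_type type, bool patched)
{
	struct toy_sock sk = { type, TOY_ESTABLISHED, true };

	if (patched)
		toy_dead_peer_v2(&sk);
	else
		toy_dead_peer_old(&sk);
	return sk.state;
}
```

In this model a TOY_SEQPACKET socket ends in TOY_CLOSE on the old path and stays TOY_ESTABLISHED on the v2 path, while a TOY_DGRAM socket ends in TOY_CLOSE on both.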

There is a race in which a live SOCK_SEQPACKET socket
may change its state from TCP_ESTABLISHED to TCP_CLOSE:

unix_release_sock(peer)                  unix_dgram_sendmsg(sk)
  sock_orphan(peer)
    sock_set_flag(peer, SOCK_DEAD)
                                           sock_alloc_send_pskb()
                                             if !(sk->sk_shutdown & SEND_SHUTDOWN)
                                               OK
                                           if sock_flag(peer, SOCK_DEAD)
                                             sk->sk_state = TCP_CLOSE
  sk->sk_shutdown = SHUTDOWN_MASK


After that, socket sk remains almost normal: it is able to connect, listen, accept
and recvmsg, but it can't sendmsg.

Since this is the only way for a live SOCK_SEQPACKET socket to change
its state like this, we had better fix this strange and potentially
dangerous corner case.

Also, move the TCP_CLOSE assignment for SOCK_DGRAM sockets under the state lock
to fix a race with unix_dgram_connect():

unix_dgram_connect(other)            unix_dgram_sendmsg(sk)
                                       unix_peer(sk) = NULL
                                       unix_state_unlock(sk)
  unix_state_double_lock(sk, other)
  sk->sk_state  = TCP_ESTABLISHED
  unix_peer(sk) = other
  unix_state_double_unlock(sk, other)
                                       sk->sk_state  = TCP_CLOSE

This patch fixes both of these races.

Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Kirill Tkhai <tkhai@ya.ru>
---
v2: Disconnect from peer right there.

 net/unix/af_unix.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
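
The unix_dgram_connect() interleaving above can be replayed deterministically in a single thread. The sketch below is a toy model under stated assumptions, not kernel code: struct model_sock and every model_* function are hypothetical, the lock is represented only by the order in which the steps are called, and "consistent" is defined here as "has a peer exactly when the state is established".

```c
#include <assert.h>
#include <stddef.h>

enum model_state { MODEL_CLOSE, MODEL_ESTABLISHED };

struct model_sock {
	enum model_state state;        /* stands in for sk->sk_state */
	const struct model_sock *peer; /* stands in for unix_peer(sk) */
};

/* Locked section of the connect side: both fields change together. */
static void model_connect(struct model_sock *sk, const struct model_sock *other)
{
	sk->state = MODEL_ESTABLISHED;
	sk->peer = other;
}

/* Pre-patch send path, split at the point where the lock is dropped. */
static void model_send_old_locked(struct model_sock *sk)   { sk->peer = NULL; }
static void model_send_old_unlocked(struct model_sock *sk) { sk->state = MODEL_CLOSE; }

/* Patched send path for datagram sockets: both writes under the lock. */
static void model_send_new_locked(struct model_sock *sk)
{
	sk->peer = NULL;
	sk->state = MODEL_CLOSE;
}

/* Assumed invariant: a peer is set if and only if the state is established. */
static int model_consistent(const struct model_sock *sk)
{
	return (sk->peer != NULL) == (sk->state == MODEL_ESTABLISHED);
}

/* Replays the racy ordering; returns 1 if sk ends up consistent, else 0. */
static int model_replay_old(void)
{
	struct model_sock dead = { MODEL_CLOSE, NULL }, other = { MODEL_CLOSE, NULL };
	struct model_sock sk = { MODEL_ESTABLISHED, &dead };

	model_send_old_locked(&sk);   /* sender clears the peer, then unlocks */
	model_connect(&sk, &other);   /* connector runs its whole locked section */
	model_send_old_unlocked(&sk); /* sender's late, unlocked state write */
	return model_consistent(&sk);
}

/* Same scenario with the patched ordering. */
static int model_replay_new(void)
{
	struct model_sock dead = { MODEL_CLOSE, NULL }, other = { MODEL_CLOSE, NULL };
	struct model_sock sk = { MODEL_ESTABLISHED, &dead };

	model_send_new_locked(&sk);   /* sender finishes before unlocking */
	model_connect(&sk, &other);   /* connector can only run afterwards */
	return model_consistent(&sk);
}
```

In the old ordering the socket ends with a peer set but a closed state, so model_replay_old() reports an inconsistency; with both writes inside the locked section, model_replay_new() leaves the socket connected and established.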