Message ID | 20250108180617.154053-2-sgarzare@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | vsock: some fixes due to transport de-assignment | expand |
On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: > If the socket has been de-assigned or assigned to another transport, > we must discard any packets received because they are not expected > and would cause issues when we access vsk->transport. > > A possible scenario is described by Hyunwoo Kim in the attached link, > where after a first connect() interrupted by a signal, and a second > connect() failed, we can find `vsk->transport` at NULL, leading to a > NULL pointer dereference. > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > Reported-by: Hyunwoo Kim <v4bel@theori.io> > Reported-by: Wongi Lee <qwerty@theori.io> > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > --- > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > index 9acc13ab3f82..51a494b69be8 100644 > --- a/net/vmw_vsock/virtio_transport_common.c > +++ b/net/vmw_vsock/virtio_transport_common.c > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > lock_sock(sk); > > - /* Check if sk has been closed before lock_sock */ > - if (sock_flag(sk, SOCK_DONE)) { > + /* Check if sk has been closed or assigned to another transport before > + * lock_sock (note: listener sockets are not assigned to any transport) > + */ > + if (sock_flag(sk, SOCK_DONE) || > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { If a race scenario with vsock_listen() is added to the existing race scenario, the patch can be bypassed. In addition to the existing scenario: ``` cpu0 cpu1 socket(A) bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) vsock_bind() listen(A) vsock_listen() socket(B) connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) vsock_connect() lock_sock(sk); virtio_transport_connect() virtio_transport_connect() virtio_transport_send_pkt_info() vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) queue_work(vsock_loopback_work) sk->sk_state = TCP_SYN_SENT; release_sock(sk); vsock_loopback_work() virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) sk = vsock_find_bound_socket(&dst); virtio_transport_recv_listen(sk, skb) child = vsock_create_connected(sk); vsock_assign_transport() vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); vsock_insert_connected(vchild); list_add(&vsk->connected_table, list); virtio_transport_send_response(vchild, skb); virtio_transport_send_pkt_info() vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) queue_work(vsock_loopback_work) vsock_loopback_work() virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) sk = vsock_find_bound_socket(&dst); lock_sock(sk); case TCP_SYN_SENT: virtio_transport_recv_connecting() sk->sk_state = TCP_ESTABLISHED; release_sock(sk); kill(connect(B)); lock_sock(sk); if (signal_pending(current)) { sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; sock->state = SS_UNCONNECTED; // [1] release_sock(sk); connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) vsock_connect(B) lock_sock(sk); vsock_assign_transport() virtio_transport_release() virtio_transport_close() if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) virtio_transport_shutdown() virtio_transport_send_pkt_info() vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) queue_work(vsock_loopback_work) schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] vsock_deassign_transport() vsk->transport = NULL; return -ESOCKTNOSUPPORT; release_sock(sk); vsock_loopback_work() virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) virtio_transport_recv_connected() virtio_transport_reset() virtio_transport_send_pkt_info() vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) queue_work(vsock_loopback_work) listen(B) vsock_listen() if (sock->state != SS_UNCONNECTED) // [2] sk->sk_state = TCP_LISTEN; // [3] vsock_loopback_work() virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] ... virtio_transport_close_timeout() virtio_transport_do_close() vsock_stream_has_data() return vsk->transport->stream_has_data(vsk); // null-ptr-deref ``` (Yes, This is quite a crazy scenario, but it can actually be induced) Since sock->state is set to SS_UNCONNECTED during the first connect()[1], it can pass the sock->state check[2] in vsock_listen() and set sk->sk_state to TCP_LISTEN[3]. If this happens, the check in the patch with `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can still occur. More specifically, because the sk_state has changed to TCP_LISTEN, virtio_transport_recv_disconnecting() will not be called by the loopback worker. However, a null-ptr-deref may occur in virtio_transport_close_timeout(), which is scheduled by virtio_transport_close() called in the flow of the second connect()[5]. (The patch no longer cancels the virtio_transport_close_timeout() worker) And even if the `sk->sk_state != TCP_LISTEN` check is removed from the patch, it seems that a null-ptr-deref will still occur due to virtio_transport_close_timeout(). It might be necessary to add worker cancellation at the appropriate location.
On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: >On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: >> If the socket has been de-assigned or assigned to another transport, >> we must discard any packets received because they are not expected >> and would cause issues when we access vsk->transport. >> >> A possible scenario is described by Hyunwoo Kim in the attached link, >> where after a first connect() interrupted by a signal, and a second >> connect() failed, we can find `vsk->transport` at NULL, leading to a >> NULL pointer dereference. >> >> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") >> Reported-by: Hyunwoo Kim <v4bel@theori.io> >> Reported-by: Wongi Lee <qwerty@theori.io> >> Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ >> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> >> --- >> net/vmw_vsock/virtio_transport_common.c | 7 +++++-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c >> index 9acc13ab3f82..51a494b69be8 100644 >> --- a/net/vmw_vsock/virtio_transport_common.c >> +++ b/net/vmw_vsock/virtio_transport_common.c >> @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, >> >> lock_sock(sk); >> >> - /* Check if sk has been closed before lock_sock */ >> - if (sock_flag(sk, SOCK_DONE)) { >> + /* Check if sk has been closed or assigned to another transport before >> + * lock_sock (note: listener sockets are not assigned to any transport) >> + */ >> + if (sock_flag(sk, SOCK_DONE) || >> + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > >If a race scenario with vsock_listen() is added to the existing >race scenario, the patch can be bypassed. > >In addition to the existing scenario: >``` > cpu0 cpu1 > > socket(A) > > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) > vsock_bind() > > listen(A) > vsock_listen() > socket(B) > > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) > vsock_connect() > lock_sock(sk); > virtio_transport_connect() > virtio_transport_connect() > virtio_transport_send_pkt_info() > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) > queue_work(vsock_loopback_work) > sk->sk_state = TCP_SYN_SENT; > release_sock(sk); > vsock_loopback_work() > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) > sk = vsock_find_bound_socket(&dst); > virtio_transport_recv_listen(sk, skb) > child = vsock_create_connected(sk); > vsock_assign_transport() > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); > vsock_insert_connected(vchild); > list_add(&vsk->connected_table, list); > virtio_transport_send_response(vchild, skb); > virtio_transport_send_pkt_info() > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) > queue_work(vsock_loopback_work) > > vsock_loopback_work() > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) > sk = vsock_find_bound_socket(&dst); > lock_sock(sk); > case TCP_SYN_SENT: > virtio_transport_recv_connecting() > sk->sk_state = TCP_ESTABLISHED; > release_sock(sk); > > kill(connect(B)); > lock_sock(sk); > if (signal_pending(current)) { > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > sock->state = SS_UNCONNECTED; // [1] > release_sock(sk); > > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) > vsock_connect(B) > lock_sock(sk); > vsock_assign_transport() > virtio_transport_release() > virtio_transport_close() > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) > virtio_transport_shutdown() > virtio_transport_send_pkt_info() > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > queue_work(vsock_loopback_work) > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] > vsock_deassign_transport() > vsk->transport = NULL; > return -ESOCKTNOSUPPORT; > release_sock(sk); > vsock_loopback_work() > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > virtio_transport_recv_connected() > virtio_transport_reset() > virtio_transport_send_pkt_info() > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) > queue_work(vsock_loopback_work) > listen(B) > vsock_listen() > if (sock->state != SS_UNCONNECTED) // [2] > sk->sk_state = TCP_LISTEN; // [3] > > vsock_loopback_work() > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] > ... > > virtio_transport_close_timeout() > virtio_transport_do_close() > vsock_stream_has_data() > return vsk->transport->stream_has_data(vsk); // null-ptr-deref > >``` >(Yes, This is quite a crazy scenario, but it can actually be induced) > >Since sock->state is set to SS_UNCONNECTED during the first connect()[1], >it can pass the sock->state check[2] in vsock_listen() and set >sk->sk_state to TCP_LISTEN[3]. >If this happens, the check in the patch with >`sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can >still occur.) > >More specifically, because the sk_state has changed to TCP_LISTEN, >virtio_transport_recv_disconnecting() will not be called by the >loopback worker. However, a null-ptr-deref may occur in >virtio_transport_close_timeout(), which is scheduled by >virtio_transport_close() called in the flow of the second connect()[5]. >(The patch no longer cancels the virtio_transport_close_timeout() worker) > >And even if the `sk->sk_state != TCP_LISTEN` check is removed from the >patch, it seems that a null-ptr-deref will still occur due to >virtio_transport_close_timeout(). >It might be necessary to add worker cancellation at the >appropriate location. Thanks for the analysis! Do you have time to cook a proper patch to cover this scenario? Or we should mix this fix together with your patch (return 0 in vsock_stream_has_data()) while we investigate a better handling? Thanks, Stefano
On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote: > On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: > > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: > > > If the socket has been de-assigned or assigned to another transport, > > > we must discard any packets received because they are not expected > > > and would cause issues when we access vsk->transport. > > > > > > A possible scenario is described by Hyunwoo Kim in the attached link, > > > where after a first connect() interrupted by a signal, and a second > > > connect() failed, we can find `vsk->transport` at NULL, leading to a > > > NULL pointer dereference. > > > > > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > > > Reported-by: Hyunwoo Kim <v4bel@theori.io> > > > Reported-by: Wongi Lee <qwerty@theori.io> > > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > > > --- > > > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > > > index 9acc13ab3f82..51a494b69be8 100644 > > > --- a/net/vmw_vsock/virtio_transport_common.c > > > +++ b/net/vmw_vsock/virtio_transport_common.c > > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > > > > > lock_sock(sk); > > > > > > - /* Check if sk has been closed before lock_sock */ > > > - if (sock_flag(sk, SOCK_DONE)) { > > > + /* Check if sk has been closed or assigned to another transport before > > > + * lock_sock (note: listener sockets are not assigned to any transport) > > > + */ > > > + if (sock_flag(sk, SOCK_DONE) || > > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > > > > If a race scenario with vsock_listen() is added to the existing > > race scenario, the patch can be bypassed. > > > > In addition to the existing scenario: > > ``` > > cpu0 cpu1 > > > > socket(A) > > > > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) > > vsock_bind() > > > > listen(A) > > vsock_listen() > > socket(B) > > > > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) > > vsock_connect() > > lock_sock(sk); > > virtio_transport_connect() > > virtio_transport_connect() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) > > queue_work(vsock_loopback_work) > > sk->sk_state = TCP_SYN_SENT; > > release_sock(sk); > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) > > sk = vsock_find_bound_socket(&dst); > > virtio_transport_recv_listen(sk, skb) > > child = vsock_create_connected(sk); > > vsock_assign_transport() > > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); > > vsock_insert_connected(vchild); > > list_add(&vsk->connected_table, list); > > virtio_transport_send_response(vchild, skb); > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > queue_work(vsock_loopback_work) > > > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > sk = vsock_find_bound_socket(&dst); > > lock_sock(sk); > > case TCP_SYN_SENT: > > virtio_transport_recv_connecting() > > sk->sk_state = TCP_ESTABLISHED; > > release_sock(sk); > > > > kill(connect(B)); > > lock_sock(sk); > > if (signal_pending(current)) { > > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > > sock->state = SS_UNCONNECTED; // [1] > > release_sock(sk); > > > > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) > > vsock_connect(B) > > lock_sock(sk); > > vsock_assign_transport() > > virtio_transport_release() > > virtio_transport_close() > > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) > > virtio_transport_shutdown() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > queue_work(vsock_loopback_work) > > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] > > vsock_deassign_transport() > > vsk->transport = NULL; > > return -ESOCKTNOSUPPORT; > > release_sock(sk); > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > virtio_transport_recv_connected() > > virtio_transport_reset() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) > > queue_work(vsock_loopback_work) > > listen(B) > > vsock_listen() > > if (sock->state != SS_UNCONNECTED) // [2] > > sk->sk_state = TCP_LISTEN; // [3] > > > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) > > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] > > ... > > > > virtio_transport_close_timeout() > > virtio_transport_do_close() > > vsock_stream_has_data() > > return vsk->transport->stream_has_data(vsk); // null-ptr-deref > > > > ``` > > (Yes, This is quite a crazy scenario, but it can actually be induced) > > > > Since sock->state is set to SS_UNCONNECTED during the first connect()[1], > > it can pass the sock->state check[2] in vsock_listen() and set > > sk->sk_state to TCP_LISTEN[3]. > > If this happens, the check in the patch with > > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can > > still occur.) > > > > More specifically, because the sk_state has changed to TCP_LISTEN, > > virtio_transport_recv_disconnecting() will not be called by the > > loopback worker. However, a null-ptr-deref may occur in > > virtio_transport_close_timeout(), which is scheduled by > > virtio_transport_close() called in the flow of the second connect()[5]. > > (The patch no longer cancels the virtio_transport_close_timeout() worker) > > > > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the > > patch, it seems that a null-ptr-deref will still occur due to > > virtio_transport_close_timeout(). > > It might be necessary to add worker cancellation at the > > appropriate location. > > Thanks for the analysis! > > Do you have time to cook a proper patch to cover this scenario? > Or we should mix this fix together with your patch (return 0 in > vsock_stream_has_data()) while we investigate a better handling? > > Thanks, > Stefano better combine them imho.
On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote: > On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: > > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: > > > If the socket has been de-assigned or assigned to another transport, > > > we must discard any packets received because they are not expected > > > and would cause issues when we access vsk->transport. > > > > > > A possible scenario is described by Hyunwoo Kim in the attached link, > > > where after a first connect() interrupted by a signal, and a second > > > connect() failed, we can find `vsk->transport` at NULL, leading to a > > > NULL pointer dereference. > > > > > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > > > Reported-by: Hyunwoo Kim <v4bel@theori.io> > > > Reported-by: Wongi Lee <qwerty@theori.io> > > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > > > --- > > > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > > > index 9acc13ab3f82..51a494b69be8 100644 > > > --- a/net/vmw_vsock/virtio_transport_common.c > > > +++ b/net/vmw_vsock/virtio_transport_common.c > > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > > > > > lock_sock(sk); > > > > > > - /* Check if sk has been closed before lock_sock */ > > > - if (sock_flag(sk, SOCK_DONE)) { > > > + /* Check if sk has been closed or assigned to another transport before > > > + * lock_sock (note: listener sockets are not assigned to any transport) > > > + */ > > > + if (sock_flag(sk, SOCK_DONE) || > > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > > > > If a race scenario with vsock_listen() is added to the existing > > race scenario, the patch can be bypassed. > > > > In addition to the existing scenario: > > ``` > > cpu0 cpu1 > > > > socket(A) > > > > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) > > vsock_bind() > > > > listen(A) > > vsock_listen() > > socket(B) > > > > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) > > vsock_connect() > > lock_sock(sk); > > virtio_transport_connect() > > virtio_transport_connect() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) > > queue_work(vsock_loopback_work) > > sk->sk_state = TCP_SYN_SENT; > > release_sock(sk); > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) > > sk = vsock_find_bound_socket(&dst); > > virtio_transport_recv_listen(sk, skb) > > child = vsock_create_connected(sk); > > vsock_assign_transport() > > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); > > vsock_insert_connected(vchild); > > list_add(&vsk->connected_table, list); > > virtio_transport_send_response(vchild, skb); > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > queue_work(vsock_loopback_work) > > > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > sk = vsock_find_bound_socket(&dst); > > lock_sock(sk); > > case TCP_SYN_SENT: > > virtio_transport_recv_connecting() > > sk->sk_state = TCP_ESTABLISHED; > > release_sock(sk); > > > > kill(connect(B)); > > lock_sock(sk); > > if (signal_pending(current)) { > > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > > sock->state = SS_UNCONNECTED; // [1] > > release_sock(sk); > > > > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) > > vsock_connect(B) > > lock_sock(sk); > > vsock_assign_transport() > > virtio_transport_release() > > virtio_transport_close() > > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) > > virtio_transport_shutdown() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > queue_work(vsock_loopback_work) > > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] > > vsock_deassign_transport() > > vsk->transport = NULL; > > return -ESOCKTNOSUPPORT; > > release_sock(sk); > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > virtio_transport_recv_connected() > > virtio_transport_reset() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) > > queue_work(vsock_loopback_work) > > listen(B) > > vsock_listen() > > if (sock->state != SS_UNCONNECTED) // [2] > > sk->sk_state = TCP_LISTEN; // [3] > > > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) > > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] > > ... > > > > virtio_transport_close_timeout() > > virtio_transport_do_close() > > vsock_stream_has_data() > > return vsk->transport->stream_has_data(vsk); // null-ptr-deref > > > > ``` > > (Yes, This is quite a crazy scenario, but it can actually be induced) > > > > Since sock->state is set to SS_UNCONNECTED during the first connect()[1], > > it can pass the sock->state check[2] in vsock_listen() and set > > sk->sk_state to TCP_LISTEN[3]. > > If this happens, the check in the patch with > > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can > > still occur.) > > > > More specifically, because the sk_state has changed to TCP_LISTEN, > > virtio_transport_recv_disconnecting() will not be called by the > > loopback worker. However, a null-ptr-deref may occur in > > virtio_transport_close_timeout(), which is scheduled by > > virtio_transport_close() called in the flow of the second connect()[5]. > > (The patch no longer cancels the virtio_transport_close_timeout() worker) > > > > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the > > patch, it seems that a null-ptr-deref will still occur due to > > virtio_transport_close_timeout(). > > It might be necessary to add worker cancellation at the > > appropriate location. > > Thanks for the analysis! > > Do you have time to cook a proper patch to cover this scenario? > Or we should mix this fix together with your patch (return 0 in > vsock_stream_has_data()) while we investigate a better handling? For now, it seems better to merge them together. It seems that covering this scenario will require more analysis and testing. Regards, Hyunwoo Kim
On Thu, Jan 09, 2025 at 04:13:44AM -0500, Hyunwoo Kim wrote: >On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote: >> On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: >> > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: >> > > If the socket has been de-assigned or assigned to another transport, >> > > we must discard any packets received because they are not expected >> > > and would cause issues when we access vsk->transport. >> > > >> > > A possible scenario is described by Hyunwoo Kim in the attached link, >> > > where after a first connect() interrupted by a signal, and a second >> > > connect() failed, we can find `vsk->transport` at NULL, leading to a >> > > NULL pointer dereference. >> > > >> > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") >> > > Reported-by: Hyunwoo Kim <v4bel@theori.io> >> > > Reported-by: Wongi Lee <qwerty@theori.io> >> > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ >> > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> >> > > --- >> > > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- >> > > 1 file changed, 5 insertions(+), 2 deletions(-) >> > > >> > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c >> > > index 9acc13ab3f82..51a494b69be8 100644 >> > > --- a/net/vmw_vsock/virtio_transport_common.c >> > > +++ b/net/vmw_vsock/virtio_transport_common.c >> > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, >> > > >> > > lock_sock(sk); >> > > >> > > - /* Check if sk has been closed before lock_sock */ >> > > - if (sock_flag(sk, SOCK_DONE)) { >> > > + /* Check if sk has been closed or assigned to another transport before >> > > + * lock_sock (note: listener sockets are not assigned to any transport) >> > > + */ >> > > + if (sock_flag(sk, SOCK_DONE) || >> > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { >> > >> > If a race scenario with vsock_listen() is added to the existing >> > race scenario, the patch can be bypassed. >> > >> > In addition to the existing scenario: >> > ``` >> > cpu0 cpu1 >> > >> > socket(A) >> > >> > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) >> > vsock_bind() >> > >> > listen(A) >> > vsock_listen() >> > socket(B) >> > >> > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) >> > vsock_connect() >> > lock_sock(sk); >> > virtio_transport_connect() >> > virtio_transport_connect() >> > virtio_transport_send_pkt_info() >> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) >> > queue_work(vsock_loopback_work) >> > sk->sk_state = TCP_SYN_SENT; >> > release_sock(sk); >> > vsock_loopback_work() >> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) >> > sk = vsock_find_bound_socket(&dst); >> > virtio_transport_recv_listen(sk, skb) >> > child = vsock_create_connected(sk); >> > vsock_assign_transport() >> > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); >> > vsock_insert_connected(vchild); >> > list_add(&vsk->connected_table, list); >> > virtio_transport_send_response(vchild, skb); >> > virtio_transport_send_pkt_info() >> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) >> > queue_work(vsock_loopback_work) >> > >> > vsock_loopback_work() >> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) >> > sk = vsock_find_bound_socket(&dst); >> > lock_sock(sk); >> > case TCP_SYN_SENT: >> > virtio_transport_recv_connecting() >> > sk->sk_state = TCP_ESTABLISHED; >> > release_sock(sk); >> > >> > kill(connect(B)); >> > lock_sock(sk); >> > if (signal_pending(current)) { >> > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; >> > sock->state = SS_UNCONNECTED; // [1] >> > release_sock(sk); >> > >> > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) >> > vsock_connect(B) >> > lock_sock(sk); >> > vsock_assign_transport() >> > virtio_transport_release() >> > virtio_transport_close() >> > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) >> > virtio_transport_shutdown() >> > virtio_transport_send_pkt_info() >> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) >> > queue_work(vsock_loopback_work) >> > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] >> > vsock_deassign_transport() >> > vsk->transport = NULL; >> > return -ESOCKTNOSUPPORT; >> > release_sock(sk); >> > vsock_loopback_work() >> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) >> > virtio_transport_recv_connected() >> > virtio_transport_reset() >> > virtio_transport_send_pkt_info() >> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) >> > queue_work(vsock_loopback_work) >> > listen(B) >> > vsock_listen() >> > if (sock->state != SS_UNCONNECTED) // [2] >> > sk->sk_state = TCP_LISTEN; // [3] >> > >> > vsock_loopback_work() >> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) >> > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] >> > ... >> > >> > virtio_transport_close_timeout() >> > virtio_transport_do_close() >> > vsock_stream_has_data() >> > return vsk->transport->stream_has_data(vsk); // null-ptr-deref >> > >> > ``` >> > (Yes, This is quite a crazy scenario, but it can actually be induced) >> > >> > Since sock->state is set to SS_UNCONNECTED during the first connect()[1], >> > it can pass the sock->state check[2] in vsock_listen() and set >> > sk->sk_state to TCP_LISTEN[3]. >> > If this happens, the check in the patch with >> > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can >> > still occur.) >> > >> > More specifically, because the sk_state has changed to TCP_LISTEN, >> > virtio_transport_recv_disconnecting() will not be called by the >> > loopback worker. However, a null-ptr-deref may occur in >> > virtio_transport_close_timeout(), which is scheduled by >> > virtio_transport_close() called in the flow of the second connect()[5]. >> > (The patch no longer cancels the virtio_transport_close_timeout() worker) >> > >> > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the >> > patch, it seems that a null-ptr-deref will still occur due to >> > virtio_transport_close_timeout(). >> > It might be necessary to add worker cancellation at the >> > appropriate location. >> >> Thanks for the analysis! >> >> Do you have time to cook a proper patch to cover this scenario? >> Or we should mix this fix together with your patch (return 0 in >> vsock_stream_has_data()) while we investigate a better handling? > >For now, it seems better to merge them together. Okay, since both you and Michael agree on that, I'll include your changes in this series, but adding a warning message, since it should not happen. Is that fine with you? > >It seems that covering this scenario will require more analysis and >testing. Yeah, scheduling a task during the release is tricky, especially when we are changing the transport, so I think we should handle that better. One idea that I have it to cancel delayed works in virtio_transport_destruct(), I'll test it a bit and add a patch for that in the next version of this series. We also need to reset SOCK_DONE after changing the transports. Thanks, Stefano
On Thu, Jan 09, 2025 at 11:59:21AM +0100, Stefano Garzarella wrote: > On Thu, Jan 09, 2025 at 04:13:44AM -0500, Hyunwoo Kim wrote: > > On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote: > > > On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: > > > > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: > > > > > If the socket has been de-assigned or assigned to another transport, > > > > > we must discard any packets received because they are not expected > > > > > and would cause issues when we access vsk->transport. > > > > > > > > > > A possible scenario is described by Hyunwoo Kim in the attached link, > > > > > where after a first connect() interrupted by a signal, and a second > > > > > connect() failed, we can find `vsk->transport` at NULL, leading to a > > > > > NULL pointer dereference. > > > > > > > > > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > > > > > Reported-by: Hyunwoo Kim <v4bel@theori.io> > > > > > Reported-by: Wongi Lee <qwerty@theori.io> > > > > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > > > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > > > > > --- > > > > > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > > > > > index 9acc13ab3f82..51a494b69be8 100644 > > > > > --- a/net/vmw_vsock/virtio_transport_common.c > > > > > +++ b/net/vmw_vsock/virtio_transport_common.c > > > > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > > > > > > > > > lock_sock(sk); > > > > > > > > > > - /* Check if sk has been closed before lock_sock */ > > > > > - if (sock_flag(sk, SOCK_DONE)) { > > > > > + /* Check if sk has been closed or assigned to another transport before > > > > > + * lock_sock (note: listener sockets are not assigned to any transport) > > > > > + */ > > > > > + if (sock_flag(sk, SOCK_DONE) || > > > > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > > > > > > > > If a race scenario with vsock_listen() is added to the existing > > > > race scenario, the patch can be bypassed. > > > > > > > > In addition to the existing scenario: > > > > ``` > > > > cpu0 cpu1 > > > > > > > > socket(A) > > > > > > > > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) > > > > vsock_bind() > > > > > > > > listen(A) > > > > vsock_listen() > > > > socket(B) > > > > > > > > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) > > > > vsock_connect() > > > > lock_sock(sk); > > > > virtio_transport_connect() > > > > virtio_transport_connect() > > > > virtio_transport_send_pkt_info() > > > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) > > > > queue_work(vsock_loopback_work) > > > > sk->sk_state = TCP_SYN_SENT; > > > > release_sock(sk); > > > > vsock_loopback_work() > > > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) > > > > sk = vsock_find_bound_socket(&dst); > > > > virtio_transport_recv_listen(sk, skb) > > > > child = vsock_create_connected(sk); > > > > vsock_assign_transport() > > > > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); > > > > vsock_insert_connected(vchild); > > > > list_add(&vsk->connected_table, list); > > > > virtio_transport_send_response(vchild, skb); > > > > virtio_transport_send_pkt_info() > > > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > > > queue_work(vsock_loopback_work) > > > > > > > > vsock_loopback_work() > > > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > > > sk = vsock_find_bound_socket(&dst); > > > > lock_sock(sk); > > > > case TCP_SYN_SENT: > > > > virtio_transport_recv_connecting() > > > > sk->sk_state = TCP_ESTABLISHED; > > > > release_sock(sk); > > > > > > > > kill(connect(B)); > > > > lock_sock(sk); > > > > if (signal_pending(current)) { > > > > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > > > > sock->state = SS_UNCONNECTED; // [1] > > > > release_sock(sk); > > > > > > > > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) > > > > vsock_connect(B) > > > > lock_sock(sk); > > > > vsock_assign_transport() > > > > virtio_transport_release() > > > > virtio_transport_close() > > > > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) > > > > virtio_transport_shutdown() > > > > virtio_transport_send_pkt_info() > > > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > > > queue_work(vsock_loopback_work) > > > > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] > > > > vsock_deassign_transport() > > > > vsk->transport = NULL; > > > > return -ESOCKTNOSUPPORT; > > > > release_sock(sk); > > > > vsock_loopback_work() > > > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > > > virtio_transport_recv_connected() > > > > virtio_transport_reset() > > > > virtio_transport_send_pkt_info() > > > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) > > > > queue_work(vsock_loopback_work) > > > > listen(B) > > > > vsock_listen() > > > > if (sock->state != SS_UNCONNECTED) // [2] > > > > sk->sk_state = TCP_LISTEN; // [3] > > > > > > > > vsock_loopback_work() > > > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) > > > > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] > > > > ... > > > > > > > > virtio_transport_close_timeout() > > > > virtio_transport_do_close() > > > > vsock_stream_has_data() > > > > return vsk->transport->stream_has_data(vsk); // null-ptr-deref > > > > > > > > ``` > > > > (Yes, This is quite a crazy scenario, but it can actually be induced) > > > > > > > > Since sock->state is set to SS_UNCONNECTED during the first connect()[1], > > > > it can pass the sock->state check[2] in vsock_listen() and set > > > > sk->sk_state to TCP_LISTEN[3]. > > > > If this happens, the check in the patch with > > > > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can > > > > still occur.) > > > > > > > > More specifically, because the sk_state has changed to TCP_LISTEN, > > > > virtio_transport_recv_disconnecting() will not be called by the > > > > loopback worker. However, a null-ptr-deref may occur in > > > > virtio_transport_close_timeout(), which is scheduled by > > > > virtio_transport_close() called in the flow of the second connect()[5]. > > > > (The patch no longer cancels the virtio_transport_close_timeout() worker) > > > > > > > > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the > > > > patch, it seems that a null-ptr-deref will still occur due to > > > > virtio_transport_close_timeout(). > > > > It might be necessary to add worker cancellation at the > > > > appropriate location. > > > > > > Thanks for the analysis! > > > > > > Do you have time to cook a proper patch to cover this scenario? > > > Or we should mix this fix together with your patch (return 0 in > > > vsock_stream_has_data()) while we investigate a better handling? > > > > For now, it seems better to merge them together. > > Okay, since both you and Michael agree on that, I'll include your changes in > this series, but adding a warning message, since it should not happen. > > Is that fine with you? Yes, I agree. > > > > > It seems that covering this scenario will require more analysis and > > testing. > > Yeah, scheduling a task during the release is tricky, especially when we are > changing the transport, so I think we should handle that better. > > One idea that I have it to cancel delayed works in > virtio_transport_destruct(), I'll test it a bit and add a patch for that in > the next version of this series. OK. once the patch is submitted, I will review it. > > We also need to reset SOCK_DONE after changing the transports. > > Thanks, > Stefano >
On 1/8/25 19:06, Stefano Garzarella wrote: > If the socket has been de-assigned or assigned to another transport, > we must discard any packets received because they are not expected > and would cause issues when we access vsk->transport. > > A possible scenario is described by Hyunwoo Kim in the attached link, > where after a first connect() interrupted by a signal, and a second > connect() failed, we can find `vsk->transport` at NULL, leading to a > NULL pointer dereference. > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > Reported-by: Hyunwoo Kim <v4bel@theori.io> > Reported-by: Wongi Lee <qwerty@theori.io> > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > --- > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > index 9acc13ab3f82..51a494b69be8 100644 > --- a/net/vmw_vsock/virtio_transport_common.c > +++ b/net/vmw_vsock/virtio_transport_common.c > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > lock_sock(sk); > > - /* Check if sk has been closed before lock_sock */ > - if (sock_flag(sk, SOCK_DONE)) { > + /* Check if sk has been closed or assigned to another transport before > + * lock_sock (note: listener sockets are not assigned to any transport) > + */ > + if (sock_flag(sk, SOCK_DONE) || > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > (void)virtio_transport_reset_no_sock(t, skb); > release_sock(sk); > sock_put(sk); FWIW, I've tried simplifying Hyunwoo's repro to toy with some tests. Ended up with ``` from threading import * from socket import * from signal import * def listener(tid): while True: s = socket(AF_VSOCK, SOCK_SEQPACKET) s.bind((1, 1234)) s.listen() pthread_kill(tid, SIGUSR1) signal(SIGUSR1, lambda *args: None) Thread(target=listener, args=[get_ident()]).start() while True: c = socket(AF_VSOCK, SOCK_SEQPACKET) c.connect_ex((1, 1234)) c.connect_ex((42, 1234)) ``` which gives me splats with or without this patch. That said, when I apply this patch, but drop the `sk->sk_state != TCP_LISTEN &&`: no more splats.
On Thu, Jan 09, 2025 at 02:34:28PM +0100, Michal Luczaj wrote: >On 1/8/25 19:06, Stefano Garzarella wrote: >> If the socket has been de-assigned or assigned to another transport, >> we must discard any packets received because they are not expected >> and would cause issues when we access vsk->transport. >> >> A possible scenario is described by Hyunwoo Kim in the attached link, >> where after a first connect() interrupted by a signal, and a second >> connect() failed, we can find `vsk->transport` at NULL, leading to a >> NULL pointer dereference. >> >> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") >> Reported-by: Hyunwoo Kim <v4bel@theori.io> >> Reported-by: Wongi Lee <qwerty@theori.io> >> Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ >> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> >> --- >> net/vmw_vsock/virtio_transport_common.c | 7 +++++-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c >> index 9acc13ab3f82..51a494b69be8 100644 >> --- a/net/vmw_vsock/virtio_transport_common.c >> +++ b/net/vmw_vsock/virtio_transport_common.c >> @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, >> >> lock_sock(sk); >> >> - /* Check if sk has been closed before lock_sock */ >> - if (sock_flag(sk, SOCK_DONE)) { >> + /* Check if sk has been closed or assigned to another transport before >> + * lock_sock (note: listener sockets are not assigned to any transport) >> + */ >> + if (sock_flag(sk, SOCK_DONE) || >> + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { >> (void)virtio_transport_reset_no_sock(t, skb); >> release_sock(sk); >> sock_put(sk); > >FWIW, I've tried simplifying Hyunwoo's repro to toy with some tests. Ended >up with > >``` >from threading import * >from socket import * >from signal import * > >def listener(tid): > while True: > s = socket(AF_VSOCK, SOCK_SEQPACKET) > s.bind((1, 1234)) > s.listen() > pthread_kill(tid, SIGUSR1) > >signal(SIGUSR1, lambda *args: None) >Thread(target=listener, args=[get_ident()]).start() > >while True: > c = socket(AF_VSOCK, SOCK_SEQPACKET) > c.connect_ex((1, 1234)) > c.connect_ex((42, 1234)) >``` > >which gives me splats with or without this patch. > >That said, when I apply this patch, but drop the `sk->sk_state != >TCP_LISTEN &&`: no more splats. We can't drop `sk->sk_state != TCP_LISTEN &&` because listener socket doesn't have any transport (vsk->transport == NULL), so every connection request will receive an error, so maybe this is the reason of no splats. I'm cooking some more patches to fix Hyunwoo's scenario handling better the close work when the virtio destructor is called. I'll run your reproduces to test it, thanks for that! Stefano
On 1/9/25 14:42, Stefano Garzarella wrote: > On Thu, Jan 09, 2025 at 02:34:28PM +0100, Michal Luczaj wrote: >> ... >> That said, when I apply this patch, but drop the `sk->sk_state != >> TCP_LISTEN &&`: no more splats. > > We can't drop `sk->sk_state != TCP_LISTEN &&` because listener socket > doesn't have any transport (vsk->transport == NULL), so every connection > request will receive an error, so maybe this is the reason of no splats. Bah, sorry, I didn't run the test suit. > I'm cooking some more patches to fix Hyunwoo's scenario handling better > the close work when the virtio destructor is called. > > I'll run your reproduces to test it, thanks for that! > > Stefano >
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 9acc13ab3f82..51a494b69be8 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, lock_sock(sk); - /* Check if sk has been closed before lock_sock */ - if (sock_flag(sk, SOCK_DONE)) { + /* Check if sk has been closed or assigned to another transport before + * lock_sock (note: listener sockets are not assigned to any transport) + */ + if (sock_flag(sk, SOCK_DONE) || + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { (void)virtio_transport_reset_no_sock(t, skb); release_sock(sk); sock_put(sk);
If the socket has been de-assigned or assigned to another transport, we must discard any packets received because they are not expected and would cause issues when we access vsk->transport. A possible scenario is described by Hyunwoo Kim in the attached link, where after a first connect() interrupted by a signal, and a second connect() failed, we can find `vsk->transport` at NULL, leading to a NULL pointer dereference. Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Reported-by: Hyunwoo Kim <v4bel@theori.io> Reported-by: Wongi Lee <qwerty@theori.io> Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> --- net/vmw_vsock/virtio_transport_common.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)