From patchwork Mon Mar 17 09:52:23 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Luczaj
X-Patchwork-Id: 14018909
Received: from mailtransmit05.runbox.com (mailtransmit05.runbox.com [185.226.149.38])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1564A230272;
 Mon, 17 Mar 2025 09:53:01 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.226.149.38
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1742205183; cv=none;
 b=Z1s+7lJwsRXZInGYQ2Kp4R+SFrtBmaz+B5OVTIz7nJllEW/J4TOnvO1+nQjxtZc2EwPK+u0+mDM+mcXdflTsD1rrGESe1DjJLG7ryja02lOrV5XpODBCCqcJebGOjD46d+A533sqpx8UFYcr4lRKpIdzJ1DrDFHim/bvGpJiubQ=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1742205183; c=relaxed/simple;
 bh=3tj+sd1hBSK0PpyiKGpyyTMF/4GJvPBwbGnV9vwWXd0=;
 h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
 In-Reply-To:To:Cc;
 b=orr2OZ9L8PWvJxhAddnmSkiADZ3W/W1rZyxQc5eDSrwn33TGsWbeU+hhtqoN97UO0UcuiL/CW5tuwSg7HEEJtwsb1PDQWSdGfvd1o6cNIu7/47BkcT69poEFFD6TheOrs9RuVcJ4+j6a7YxZSWYi+oaoaYXG5lXWZ+UDmx0x9js=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=rbox.co;
 spf=pass smtp.mailfrom=rbox.co;
 dkim=pass (2048-bit key) header.d=rbox.co header.i=@rbox.co header.b=HhbaFGun;
 arc=none smtp.client-ip=185.226.149.38
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=rbox.co
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=rbox.co
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=rbox.co header.i=@rbox.co header.b="HhbaFGun"
Received: from mailtransmit02.runbox ([10.9.9.162] helo=aibo.runbox.com)
 by mailtransmit05.runbox.com with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
 (Exim 4.93) (envelope-from ) id 1tu79W-00Dmy8-8L;
 Mon, 17 Mar 2025 10:52:58 +0100
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=rbox.co;
 s=selector1;
 h=Cc:To:In-Reply-To:References:Message-Id:
 Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From;
 bh=bxiMFSV63H5IMVOHX04yoU+77ubhpH+TthNc1KSiRUY=;
 b=HhbaFGun6t3z7GgB0kuVz5pl4F
 222dyubEa2BnbPJCZ6HOTiIf2rmBp7eTUz8X3Yc96PXpUshypqtNg+gO6qGDy8kZkg8eKzlqBKokp
 IaWOGZCNMKBnXKGd+MMqZ5AutUXXXf/BexI6Iiuz29YxajHJw7dNtJl+Q2sI1iuiKaS9ynzWVJFRH
 FNEnktz5u1Qn0l1VmS+cwNoZH0jVTl44B0Nr1XU8aITiQdw3xmp3vtGwE7l4uglDQz/l1885DU6jA
 iKu1XagdqCDIPf/44MER5G/G/pUK/K/AkvwCQHsBGmsBpANZX9iGyTdCAyQU8qin+vX9MWDeukYvz
 gu2BYcnQ==;
Received: from [10.9.9.74] (helo=submission03.runbox)
 by mailtransmit02.runbox with esmtp (Exim 4.86_2) (envelope-from )
 id 1tu79Q-0007ox-7D; Mon, 17 Mar 2025 10:52:52 +0100
Received: by submission03.runbox with esmtpsa [Authenticated ID (604044)]
 (TLS1.2:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.93)
 id 1tu796-00DI8D-4b; Mon, 17 Mar 2025 10:52:32 +0100
From: Michal Luczaj
Date: Mon, 17 Mar 2025 10:52:23 +0100
Subject: [PATCH net v4 1/3] vsock/bpf: Fix EINTR connect() racing sockmap update
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20250317-vsock-trans-signal-race-v4-1-fc8837f3f1d4@rbox.co>
References: <20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co>
In-Reply-To: <20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co>
To: Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski ,
 Paolo Abeni , Simon Horman , "Michael S.
 Tsirkin" , Bobby Eshleman , Andrii Nakryiko , Eduard Zingerman ,
 Mykola Lysenko , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau ,
 Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev ,
 Hao Luo , Jiri Olsa , Shuah Khan
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
 virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Michal Luczaj
X-Mailer: b4 0.14.2

Signal delivery during connect() may result in a disconnect of an already
TCP_ESTABLISHED socket. The problem is that such an established socket might
have been placed in a sockmap before the connection was closed.

We end up with an SS_UNCONNECTED vsock in a sockmap. This, combined with the
ability to reassign an (unconnected) vsock's transport to NULL, breaks the
sockmap contract, as manifested by WARN_ON_ONCE.

connect
  / state = SS_CONNECTED /
                                sock_map_update_elem
  if signal_pending
    state = SS_UNCONNECTED

connect
  transport = NULL
                                vsock_bpf_recvmsg
                                  WARN_ON_ONCE(!vsk->transport)

Ensure the socket does not stay in sockmap.

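For illustration only (not part of the patch): the state transition described
above can be sketched as a small user-space model. This is a minimal sketch
under stated assumptions, not kernel code. Every name in it (model_sock,
model_sockmap_update, model_connect_interrupted, ...) is hypothetical and only
stands in for the kernel field or function mentioned in its comment. The model
also assumes that a sockmap update accepts only established sockets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel state; none of these are kernel types. */
enum model_sk_state { MODEL_CLOSE, MODEL_ESTABLISHED, MODEL_CLOSING };

struct model_sock {
	enum model_sk_state sk_state;
	bool connected;				/* models sock->state == SS_CONNECTED */
	bool in_sockmap;
	void (*unhash)(struct model_sock *sk);	/* models sk->sk_prot->unhash */
};

/* Models sock_map_unhash(): drop the socket from the map, restore the proto. */
static void model_sockmap_unhash(struct model_sock *sk)
{
	sk->in_sockmap = false;
	sk->unhash = NULL;
}

/* Models a sockmap update; assumed to accept only established sockets. */
static bool model_sockmap_update(struct model_sock *sk)
{
	if (sk->sk_state != MODEL_ESTABLISHED)
		return false;

	sk->in_sockmap = true;
	sk->unhash = model_sockmap_unhash;
	return true;
}

/* Models the signal_pending() branch of connect(), with or without the fix. */
static void model_connect_interrupted(struct model_sock *sk, bool with_fix)
{
	if (sk->sk_state == MODEL_ESTABLISHED) {
		/* Might have raced with a sockmap update. */
		if (with_fix && sk->unhash)
			sk->unhash(sk);

		sk->sk_state = MODEL_CLOSING;
	} else {
		sk->sk_state = MODEL_CLOSE;
	}
	sk->connected = false;
}

/* Replays the race; true if no unconnected socket is left in the map. */
static bool model_race_leaves_map_clean(bool with_fix)
{
	struct model_sock sk = {
		.sk_state = MODEL_ESTABLISHED,
		.connected = true,
	};
	bool added = model_sockmap_update(&sk);

	assert(added);
	(void)added;
	model_connect_interrupted(&sk, with_fix);

	return !sk.in_sockmap || sk.connected;
}
```

In this model, model_race_leaves_map_clean(false) reports the stale map entry,
while model_race_leaves_map_clean(true) does not.
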
WARNING: CPU: 8 PID: 1228 at net/vmw_vsock/vsock_bpf.c:97 vsock_bpf_recvmsg+0xb43/0xe00
CPU: 8 UID: 0 PID: 1228 Comm: a.out Not tainted 6.14.0-rc5+
RIP: 0010:vsock_bpf_recvmsg+0xb43/0xe00
 sock_recvmsg+0x1b2/0x220
 __sys_recvfrom+0x190/0x270
 __x64_sys_recvfrom+0xdc/0x1b0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: 634f1a7110b4 ("vsock: support sockmap")
Reviewed-by: Stefano Garzarella
Signed-off-by: Michal Luczaj
---
 net/vmw_vsock/af_vsock.c  | 10 +++++++++-
 net/vmw_vsock/vsock_bpf.c |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 7e3db87ae4333cf63327ec105ca99253569bb9fe..81b1b8e9c946a646778367ab78ca180cef75ef72 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1579,7 +1579,15 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
 
 		if (signal_pending(current)) {
 			err = sock_intr_errno(timeout);
-			sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
+			if (sk->sk_state == TCP_ESTABLISHED) {
+				/* Might have raced with a sockmap update. */
+				if (sk->sk_prot->unhash)
+					sk->sk_prot->unhash(sk);
+
+				sk->sk_state = TCP_CLOSING;
+			} else {
+				sk->sk_state = TCP_CLOSE;
+			}
 			sock->state = SS_UNCONNECTED;
 			vsock_transport_cancel_pkt(vsk);
 			vsock_remove_connected(vsk);
diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c
index 07b96d56f3a577af71021b1b8132743554996c4f..c68fdaf09046b68254dac3ea70ffbe73dfa45cef 100644
--- a/net/vmw_vsock/vsock_bpf.c
+++ b/net/vmw_vsock/vsock_bpf.c
@@ -127,6 +127,7 @@ static void vsock_bpf_rebuild_protos(struct proto *prot, const struct proto *bas
 {
 	*prot = *base;
 	prot->close = sock_map_close;
+	prot->unhash = sock_map_unhash;
 	prot->recvmsg = vsock_bpf_recvmsg;
 	prot->sock_is_readable = sk_msg_is_readable;
 }

From patchwork Mon Mar 17 09:52:24 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Luczaj
X-Patchwork-Id: 14018908
Received: from mailtransmit04.runbox.com (mailtransmit04.runbox.com [185.226.149.37])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B21122ACD3;
 Mon, 17 Mar 2025 09:52:59 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.226.149.37
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1742205181; cv=none;
 b=aCgtTK4Z/ulcwXogOvAFRrTODSZHxu3Apn/ys9hRIfTK/VSc4MyxBbhlU5a4buSzRpQ4jlP9ihqc3v8b7YCMuxuycm9Znj7BA2WWIw93vx7xgd9rR3Qr1mW+Jvdzlrvo5nctVcKrvs8JRUuE5MAs+uh4Nnaz/Xjl+NYGlGAHbgo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1742205181; c=relaxed/simple;
 bh=ZG5kxGG80yT+wC6pgOpiIDlC4UZa5bpkuHW4Bf0jXLI=;
 h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
 In-Reply-To:To:Cc;
 b=NUE72I22FincpDBhFKsO5owkBG2Vdx7fSMWbcIhL0lRwkjgurrK/qMQpgfx+r2yMRYQSofeShVpf9KyO7A1A08BGCmRQHZ0fag7QoZZCDq+ezIZfSGx7j4BKFFGMmtzyM/AtqPoaYDgtmieNCpr7SnhbpAHw4KEsCcZ6GaZZVpQ=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=rbox.co;
 spf=pass smtp.mailfrom=rbox.co;
 dkim=pass (2048-bit key) header.d=rbox.co header.i=@rbox.co header.b=Px3F4wXs;
 arc=none smtp.client-ip=185.226.149.37
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=rbox.co
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=rbox.co
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=rbox.co header.i=@rbox.co header.b="Px3F4wXs"
Received: from mailtransmit03.runbox ([10.9.9.163] helo=aibo.runbox.com)
 by mailtransmit04.runbox.com with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
 (Exim 4.93) (envelope-from ) id 1tu79S-00E8dI-Ae;
 Mon, 17 Mar 2025 10:52:54 +0100
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=rbox.co;
 s=selector1;
 h=Cc:To:In-Reply-To:References:Message-Id:
 Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From;
 bh=aMx/ebEU2TmluYJADSJ8PMYLlDnHQ7biwjbfX3GHHD4=;
 b=Px3F4wXsRwoTyE69Z8/8odj6kn
 ji0Nc5ulI/ZftQWbCmgeCUafIFR14A2fFgyqe4omarS4j84uUJFqCGukl1o/o9A7Q4Tr6vxpnpufQ
 sKud9OEhYpih+byIhAMSFQDk9Qw8bupg5+U9gU8VYohzaO80xmiCGDB+c5CE2qxgFPJTCpVFvIFaB
 By4f4GzxjDy6KWxZ33HECY10HsWgD8C4MZNYgnMtU4tlL+cBCT//E+WhvVJv1NxVHhbITkXJrASDw
 1+a+LNG6T5biBaYrd/NeogNj7rxq4xsWN2i1L+bokpecr1fUYUPpLPCFjTeVXQ93amNVBjG2exYGT
 ciGVLIvg==;
Received: from [10.9.9.74] (helo=submission03.runbox)
 by mailtransmit03.runbox with esmtp (Exim 4.86_2) (envelope-from )
 id 1tu79S-0006s8-00; Mon, 17 Mar 2025 10:52:54 +0100
Received: by submission03.runbox with esmtpsa [Authenticated ID (604044)]
 (TLS1.2:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.93)
 id 1tu797-00DI8D-9p; Mon, 17 Mar 2025 10:52:33 +0100
From: Michal Luczaj
Date: Mon, 17 Mar 2025 10:52:24 +0100
Subject: [PATCH net v4 2/3] selftest/bpf: Add test for AF_VSOCK connect() racing sockmap update
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20250317-vsock-trans-signal-race-v4-2-fc8837f3f1d4@rbox.co>
References: <20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co>
In-Reply-To: <20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co>
To: Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski ,
 Paolo Abeni , Simon Horman , "Michael S. Tsirkin" , Bobby Eshleman ,
 Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Alexei Starovoitov ,
 Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song ,
 John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa ,
 Shuah Khan
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
 virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Michal Luczaj
X-Mailer: b4 0.14.2

Racing a signal-interrupted connect() against a sockmap update may leave an
unconnected socket (with no vsock transport assigned) in a sockmap.

The test spends 2 seconds attempting to reach WARN_ON_ONCE().

connect
  / state = SS_CONNECTED /
                                sock_map_update_elem
  if signal_pending
    state = SS_UNCONNECTED

connect
  transport = NULL
                                vsock_bpf_recvmsg
                                  WARN_ON_ONCE(!vsk->transport)

Signed-off-by: Michal Luczaj
---
 .../selftests/bpf/prog_tests/sockmap_basic.c       | 99 ++++++++++++++++++++++
 1 file changed, 99 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index 1e3e4392dcca0e1722c1982ecc649a80c27443b2..2f8bba27866354848f1e30b5473cedb6a85244ff 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -3,6 +3,7 @@
 #include
 #include
 #include
+#include
 
 #include "test_progs.h"
 #include "test_skmsg_load_helpers.skel.h"
@@ -1042,6 +1043,102 @@ static void test_sockmap_vsock_unconnected(void)
 	xclose(map);
 }
 
+#define CONNECT_SIGNAL_RACE_TIMEOUT 2 /* seconds */
+
+static void sig_handler(int signum)
+{
+	/* nop */
+}
+
+static void connect_signal_racer_cleanup(void *map)
+{
+	xclose(*(int *)map);
+}
+
+static void *connect_signal_racer(void *arg)
+{
+	pid_t pid;
+	int map;
+
+	map = bpf_map_create(BPF_MAP_TYPE_SOCKMAP, NULL, sizeof(int),
+			     sizeof(int), 1, NULL);
+	if (!ASSERT_OK_FD(map, "bpf_map_create"))
+		return NULL;
+
+	pthread_cleanup_push(connect_signal_racer_cleanup, &map);
+	pid = getpid();
+
+	for (;;) {
+		int c = *(int *)arg;
+		int zero = 0;
+
+		(void)bpf_map_update_elem(map, &zero, &c, BPF_ANY);
+
+		if (kill(pid, SIGUSR1)) {
+			FAIL_ERRNO("kill");
+			break;
+		}
+
+		if ((recv(c, NULL, 0, MSG_DONTWAIT) < 0) && errno == ENODEV) {
+			FAIL_ERRNO("recv");
+			break;
+		}
+	}
+
+	pthread_cleanup_pop(1);
+
+	return NULL;
+}
+
+static void test_sockmap_vsock_connect_signal_race(void)
+{
+	struct sockaddr_vm addr, bad_addr;
+	socklen_t alen = sizeof(addr);
+	sighandler_t orig_handler;
+	pthread_t thread;
+	int s, c, p;
+	__u64 tout;
+
+	orig_handler = signal(SIGUSR1, sig_handler);
+	if (!ASSERT_NEQ(orig_handler, SIG_ERR, "signal handler setup"))
+		return;
+
+	s = socket_loopback(AF_VSOCK, SOCK_SEQPACKET | SOCK_NONBLOCK);
+	if (s < 0)
+		goto restore;
+
+	if (xgetsockname(s, (struct sockaddr *)&addr, &alen))
+		goto close;
+
+	bad_addr = addr;
+	bad_addr.svm_cid = 0x42424242; /* non-existing */
+
+	if (xpthread_create(&thread, 0, connect_signal_racer, &c))
+		goto close;
+
+	tout = get_time_ns() + CONNECT_SIGNAL_RACE_TIMEOUT * NSEC_PER_SEC;
+	do {
+		c = xsocket(AF_VSOCK, SOCK_SEQPACKET, 0);
+		if (c < 0)
+			break;
+
+		if (connect(c, (struct sockaddr *)&addr, alen) && errno == EINTR)
+			(void)connect(c, (struct sockaddr *)&bad_addr, alen);
+
+		xclose(c);
+		p = accept(s, NULL, NULL);
+		if (p >= 0)
+			xclose(p);
+	} while (get_time_ns() < tout);
+
+	ASSERT_OK(pthread_cancel(thread), "pthread_cancel");
+	xpthread_join(thread, NULL);
+close:
+	xclose(s);
+restore:
+	ASSERT_NEQ(signal(SIGUSR1, orig_handler), SIG_ERR, "handler restore");
+}
+
 void test_sockmap_basic(void)
 {
 	if (test__start_subtest("sockmap create_update_free"))
@@ -1108,4 +1205,6 @@ void test_sockmap_basic(void)
 		test_sockmap_skb_verdict_vsock_poll();
 	if (test__start_subtest("sockmap vsock unconnected"))
 		test_sockmap_vsock_unconnected();
+	if (test__start_subtest("sockmap vsock connect signal race"))
+		test_sockmap_vsock_connect_signal_race();
 }

From patchwork Mon Mar 17 09:52:25 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michal Luczaj
X-Patchwork-Id: 14018907
Received: from mailtransmit05.runbox.com (mailtransmit05.runbox.com [185.226.149.38])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3800021D58F;
 Mon, 17 Mar 2025 09:52:52 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.226.149.38
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1742205175; cv=none;
 b=Fd48ykes6FDf3MTSx/c67R/K0XFmgHu49+TtwXuTtd1IlHxS/Gn6yhAqdDV9/okaSp/kxOWfE7+SfOcU9bKLoJr4AAVv2pVbPXagVSLq1lgEqvhZQ6K0E9ajpp6pWLCp0b1GL7WlKF3b0yK9T99qHU2nQMuxfwoBELgObPNKWxk=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1742205175; c=relaxed/simple;
 bh=WKSI5tQo2ZCmquUaEjyal9KCtNxILWDmnnMtd21oT78=;
 h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
 In-Reply-To:To:Cc;
 b=ZCY8OkNWINQ4v7A/1+Ro1bO3XYj+N0oaLISn+41V6g1VDttZ7gxz/zOm1/mDC0WZ+1FEE9CyqTLzY7PJakjhGWOyR0NyFaPr7BO/O2UTO04TZZm2QZbBBbcRjkxyfUss/UNBPmkyQCNS9wyTK8kU806ifQoM+BzAjjk4Gyte1Cc=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=rbox.co;
 spf=pass smtp.mailfrom=rbox.co;
 dkim=pass (2048-bit key) header.d=rbox.co header.i=@rbox.co header.b=JWFtrXAg;
 arc=none smtp.client-ip=185.226.149.38
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=rbox.co
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=rbox.co
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=rbox.co header.i=@rbox.co header.b="JWFtrXAg"
Received: from mailtransmit03.runbox ([10.9.9.163] helo=aibo.runbox.com)
 by mailtransmit05.runbox.com with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
 (Exim 4.93) (envelope-from ) id 1tu79O-00DmxO-It;
 Mon, 17 Mar 2025 10:52:50 +0100
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=rbox.co;
 s=selector1;
 h=Cc:To:In-Reply-To:References:Message-Id:
 Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From;
 bh=Qx6Q42lUm6Tq+A/H1tasLuFXsBAYaSnDK2lri+KURPw=;
 b=JWFtrXAgkQHQ10RuuOxq1VK2w2
 VWohFZK2Mnz39cLATaohGUFoccRvLTaHbCRUyXb9Rb69PFypkvVB9R3h3wdN6pBTVVJSRR6Ff2djK
 JCXpDW/WMF9uL6yoT8tkvr+b8Smh5S9IOcrUHzhvid3Mc+irYW0Ged44RrZgIR//QAMrdjx26eRvX
 jw7t12ZACagrJzLgIbgJ1zX3ruN2oeKK8f1kHbn3PKHiTAykw3+3ihDqnMm0oKlyrGB9tpcQzpPKE
 pZom/XHni2wGzmzKdPLE34KScel0bBELP2P3Un4qsTGtUvyB0jlSkSVTYC0zMvtQUBK1qm78Sjw7Z
 oHVr+lQg==;
Received: from [10.9.9.74] (helo=submission03.runbox)
 by mailtransmit03.runbox with esmtp (Exim 4.86_2) (envelope-from )
 id 1tu79O-0006rr-76; Mon, 17 Mar 2025 10:52:50 +0100
Received: by submission03.runbox with esmtpsa [Authenticated ID (604044)]
 (TLS1.2:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.93)
 id 1tu798-00DI8D-GG; Mon, 17 Mar 2025 10:52:34 +0100
From: Michal Luczaj
Date: Mon, 17 Mar 2025 10:52:25 +0100
Subject: [PATCH net v4 3/3] vsock/bpf: Fix bpf recvmsg() racing transport reassignment
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20250317-vsock-trans-signal-race-v4-3-fc8837f3f1d4@rbox.co>
References: <20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co>
In-Reply-To: <20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co>
To: Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski ,
 Paolo Abeni , Simon Horman , "Michael S. Tsirkin" , Bobby Eshleman ,
 Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Alexei Starovoitov ,
 Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song ,
 John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa ,
 Shuah Khan
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
 virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Michal Luczaj
X-Mailer: b4 0.14.2

Signal delivery during connect() may lead to a disconnect of an already
established socket. That involves removing the socket from any sockmap and
resetting its state to SS_UNCONNECTED. While this correctly restores the
socket's proto, a call to vsock_bpf_recvmsg() might already be under way in
another thread. If the connect()ing thread reassigns the vsock transport to
NULL, the recvmsg()ing thread may trigger a WARN_ON_ONCE.

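For illustration only (not part of the patch): the ordering problem described
above can be replayed in a single-threaded user-space model, where the racing
connect() is a function called in the window between the psock lookup and the
lock. This is a minimal sketch under stated assumptions. All names
(model_vsock, model_racing_connect, model_recv_old, model_recv_new) are
hypothetical, and a racer that finds the lock held simply does nothing, whereas
a real thread would sleep and run after the lock is released.

```c
#include <assert.h>
#include <stdbool.h>

struct model_vsock {
	bool locked;		/* models the socket lock being held */
	bool has_psock;		/* models sk_psock_get() != NULL */
	bool has_transport;	/* models vsk->transport != NULL */
};

enum model_recv_path { MODEL_RECV_FALLBACK, MODEL_RECV_BPF, MODEL_RECV_WARN };

/*
 * Models the other thread: an interrupted connect() unhashes the socket, then
 * a second connect() drops the transport. Both steps need the socket lock, so
 * nothing happens while the receiver holds it.
 */
static void model_racing_connect(struct model_vsock *vsk)
{
	if (vsk->locked)
		return;

	vsk->has_psock = false;
	vsk->has_transport = false;
}

static enum model_recv_path model_classify(bool psock,
					   const struct model_vsock *vsk)
{
	if (!psock)
		return MODEL_RECV_FALLBACK;	/* plain __vsock_recvmsg() path */

	return vsk->has_transport ? MODEL_RECV_BPF : MODEL_RECV_WARN;
}

/* Old order: look up psock, then lock. The racer fits into the gap. */
static enum model_recv_path model_recv_old(struct model_vsock *vsk)
{
	bool psock = vsk->has_psock;
	enum model_recv_path path;

	model_racing_connect(vsk);	/* race window */
	vsk->locked = true;
	path = model_classify(psock, vsk);
	vsk->locked = false;

	return path;
}

/* New order: lock, then look up psock. The racer is kept out. */
static enum model_recv_path model_recv_new(struct model_vsock *vsk)
{
	enum model_recv_path path;
	bool psock;

	assert(!vsk->locked);		/* the model's lock is not recursive */
	vsk->locked = true;
	psock = vsk->has_psock;
	model_racing_connect(vsk);	/* no effect: lock is held */
	path = model_classify(psock, vsk);
	vsk->locked = false;

	return path;
}
```

Starting from a socket with both psock and transport set, model_recv_old()
ends in MODEL_RECV_WARN while model_recv_new() ends in MODEL_RECV_BPF. If the
racer runs to completion before model_recv_new() takes the lock, the result is
MODEL_RECV_FALLBACK, which corresponds to the !psock branch.
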
connect
  / state = SS_CONNECTED /
                                sock_map_update_elem
                                vsock_bpf_recvmsg
                                  psock = sk_psock_get()
  lock sk
  if signal_pending
    unhash
      sock_map_remove_links
    state = SS_UNCONNECTED
  release sk

connect
  transport = NULL
                                  lock sk
                                  WARN_ON_ONCE(!vsk->transport)

Protect recvmsg() from racing against transport reassignment. Enforce the
sockmap invariant that psock implies transport: lock socket before getting
psock.

WARNING: CPU: 9 PID: 1222 at net/vmw_vsock/vsock_bpf.c:92 vsock_bpf_recvmsg+0xb55/0xe00
CPU: 9 UID: 0 PID: 1222 Comm: a.out Not tainted 6.14.0-rc5+
RIP: 0010:vsock_bpf_recvmsg+0xb55/0xe00
 sock_recvmsg+0x1b2/0x220
 __sys_recvfrom+0x190/0x270
 __x64_sys_recvfrom+0xdc/0x1b0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: 634f1a7110b4 ("vsock: support sockmap")
Signed-off-by: Michal Luczaj
---
 net/vmw_vsock/vsock_bpf.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c
index c68fdaf09046b68254dac3ea70ffbe73dfa45cef..5138195d91fb258d4bc09b48e80e13651d62863a 100644
--- a/net/vmw_vsock/vsock_bpf.c
+++ b/net/vmw_vsock/vsock_bpf.c
@@ -73,28 +73,35 @@ static int __vsock_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int
 	return err;
 }
 
-static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
-			     size_t len, int flags, int *addr_len)
+static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+			     int flags, int *addr_len)
 {
 	struct sk_psock *psock;
 	struct vsock_sock *vsk;
 	int copied;
 
+	/* Since signal delivery during connect() may reset the state of socket
+	 * that's already in a sockmap, take the lock before checking on psock.
+	 * This serializes a possible transport reassignment, protecting this
+	 * function from running with NULL transport.
+	 */
+	lock_sock(sk);
+
 	psock = sk_psock_get(sk);
-	if (unlikely(!psock))
+	if (unlikely(!psock)) {
+		release_sock(sk);
 		return __vsock_recvmsg(sk, msg, len, flags);
+	}
 
-	lock_sock(sk);
 	vsk = vsock_sk(sk);
-
 	if (WARN_ON_ONCE(!vsk->transport)) {
 		copied = -ENODEV;
 		goto out;
 	}
 
 	if (vsock_has_data(sk, psock) && sk_psock_queue_empty(psock)) {
-		release_sock(sk);
 		sk_psock_put(sk, psock);
+		release_sock(sk);
 		return __vsock_recvmsg(sk, msg, len, flags);
 	}
 
@@ -108,8 +115,8 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
 		}
 
 		if (sk_psock_queue_empty(psock)) {
-			release_sock(sk);
 			sk_psock_put(sk, psock);
+			release_sock(sk);
 			return __vsock_recvmsg(sk, msg, len, flags);
 		}
 
@@ -117,8 +124,8 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
 	}
 
 out:
-	release_sock(sk);
 	sk_psock_put(sk, psock);
+	release_sock(sk);
 
 	return copied;
 }