diff mbox series

[net,1/2] vsock: Orphan socket after transport release

Message ID 20250204-vsock-linger-nullderef-v1-1-6eb1760fa93e@rbox.co (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series vsock: null-ptr-deref when SO_LINGER enabled | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 1 maintainers not CCed: virtualization@lists.linux.dev
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 15 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-02-04--15-00 (tests: 886)

Commit Message

Michal Luczaj Feb. 4, 2025, 12:29 a.m. UTC
During socket release, sock_orphan() is called without considering that it
sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
null pointer dereferenced in virtio_transport_wait_close().

Orphan the socket only after transport release.

Partially reverts the 'Fixes:' commit.

KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
 lock_acquire+0x19e/0x500
 _raw_spin_lock_irqsave+0x47/0x70
 add_wait_queue+0x46/0x230
 virtio_transport_release+0x4e7/0x7f0
 __vsock_release+0xfd/0x490
 vsock_release+0x90/0x120
 __sock_release+0xa3/0x250
 sock_close+0x14/0x20
 __fput+0x35e/0xa90
 __x64_sys_close+0x78/0xd0
 do_syscall_64+0x93/0x1b0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c
Fixes: fcdd2242c023 ("vsock: Keep the binding until socket destruction")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 net/vmw_vsock/af_vsock.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Stefano Garzarella Feb. 4, 2025, 10:32 a.m. UTC | #1
On Tue, Feb 04, 2025 at 01:29:52AM +0100, Michal Luczaj wrote:
>During socket release, sock_orphan() is called without considering that it
>sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
>null pointer dereferenced in virtio_transport_wait_close().
>
>Orphan the socket only after transport release.
>
>Partially reverts the 'Fixes:' commit.
>
>KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
> lock_acquire+0x19e/0x500
> _raw_spin_lock_irqsave+0x47/0x70
> add_wait_queue+0x46/0x230
> virtio_transport_release+0x4e7/0x7f0
> __vsock_release+0xfd/0x490
> vsock_release+0x90/0x120
> __sock_release+0xa3/0x250
> sock_close+0x14/0x20
> __fput+0x35e/0xa90
> __x64_sys_close+0x78/0xd0
> do_syscall_64+0x93/0x1b0
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
>Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com
>Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c
>Fixes: fcdd2242c023 ("vsock: Keep the binding until socket destruction")

Looking better at that patch, can you check if we break commit
3a5cc90a4d17 ("vsock/virtio: remove socket from connected/bound list on 
shutdown")

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a5cc90a4d1756072619fe511d07621bdef7f120

BTW we also added a test to cover that scenario, so we should be fine 
since the suite I run was fine.

>Signed-off-by: Michal Luczaj <mhal@rbox.co>
>---
> net/vmw_vsock/af_vsock.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 075695173648d3a4ecbd04e908130efdbb393b41..06250bb9afe2f253e96130b73554aae9151aaac1 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -824,13 +824,14 @@ static void __vsock_release(struct sock *sk, int level)
> 	 */
> 	lock_sock_nested(sk, level);
>

I would add a comment here to explain that we need to set it, so 
vsock_remove_sock() called here some lines above, or by transports in 
the release() callback (maybe in the future we can refactor it, and call 
it only here) will remove the binding only if it's set, since the 
release() is also called when de-assigning the transport.

Thanks,
Stefano

>-	sock_orphan(sk);
>+	sock_set_flag(sk, SOCK_DEAD);
>
> 	if (vsk->transport)
> 		vsk->transport->release(vsk);
> 	else if (sock_type_connectible(sk->sk_type))
> 		vsock_remove_sock(vsk);
>
>+	sock_orphan(sk);
> 	sk->sk_shutdown = SHUTDOWN_MASK;
>
> 	skb_queue_purge(&sk->sk_receive_queue);
>
>-- 
>2.48.1
>
Luigi Leonardi Feb. 4, 2025, 3:44 p.m. UTC | #2
On Tue, Feb 04, 2025 at 11:32:54AM +0100, Stefano Garzarella wrote:
>On Tue, Feb 04, 2025 at 01:29:52AM +0100, Michal Luczaj wrote:
>>During socket release, sock_orphan() is called without considering that it
>>sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
>>null pointer dereferenced in virtio_transport_wait_close().
>>
>>Orphan the socket only after transport release.
>>
>>Partially reverts the 'Fixes:' commit.
>>
>>KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
>>lock_acquire+0x19e/0x500
>>_raw_spin_lock_irqsave+0x47/0x70
>>add_wait_queue+0x46/0x230
>>virtio_transport_release+0x4e7/0x7f0
>>__vsock_release+0xfd/0x490
>>vsock_release+0x90/0x120
>>__sock_release+0xa3/0x250
>>sock_close+0x14/0x20
>>__fput+0x35e/0xa90
>>__x64_sys_close+0x78/0xd0
>>do_syscall_64+0x93/0x1b0
>>entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>>Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com
>>Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c
>>Fixes: fcdd2242c023 ("vsock: Keep the binding until socket destruction")
>
>Looking better at that patch, can you check if we break commit
>3a5cc90a4d17 ("vsock/virtio: remove socket from connected/bound list 
>on shutdown")
>
>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a5cc90a4d1756072619fe511d07621bdef7f120
>
I worked with Filippo (+CC) on this patch.

IMHO it shouldn't do any harm. `sock_orphan` sets sk->sk_socket and 
sk_wq to NULL, and sets the SOCK_DEAD flag.

This patch sets the latter in the same place. All the other fields are 
not used by the transport->release() (at least in virtio-based 
transports), so from my perspective there is no real change.

What was your concern?

>BTW we also added a test to cover that scenario, so we should be fine 
>since the suite I run was fine.
Yep! Test runs fine.
>
>>Signed-off-by: Michal Luczaj <mhal@rbox.co>
>>---
>>net/vmw_vsock/af_vsock.c | 3 ++-
>>1 file changed, 2 insertions(+), 1 deletion(-)
>>
>>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>>index 075695173648d3a4ecbd04e908130efdbb393b41..06250bb9afe2f253e96130b73554aae9151aaac1 100644
>>--- a/net/vmw_vsock/af_vsock.c
>>+++ b/net/vmw_vsock/af_vsock.c
>>@@ -824,13 +824,14 @@ static void __vsock_release(struct sock *sk, int level)
>>	 */
>>	lock_sock_nested(sk, level);
>>
>
>I would add a comment here to explain that we need to set it, so 
>vsock_remove_sock() called here some lines above, or by transports in 
>the release() callback (maybe in the future we can refactor it, and 
>call it only here) will remove the binding only if it's set, since the 
>release() is also called when de-assigning the transport.
>
>Thanks,
>Stefano
>
>>-	sock_orphan(sk);
>>+	sock_set_flag(sk, SOCK_DEAD);
>>
>>	if (vsk->transport)
>>		vsk->transport->release(vsk);
>>	else if (sock_type_connectible(sk->sk_type))
>>		vsock_remove_sock(vsk);
>>
>>+	sock_orphan(sk);
>>	sk->sk_shutdown = SHUTDOWN_MASK;
>>
>>	skb_queue_purge(&sk->sk_receive_queue);
>>
>>-- 
>>2.48.1
>>
>

@Michal From the code POV I have no concerns, LGTM :)

Thanks,
Luigi
Stefano Garzarella Feb. 4, 2025, 4 p.m. UTC | #3
On Tue, Feb 04, 2025 at 04:44:13PM +0100, Luigi Leonardi wrote:
>On Tue, Feb 04, 2025 at 11:32:54AM +0100, Stefano Garzarella wrote:
>>On Tue, Feb 04, 2025 at 01:29:52AM +0100, Michal Luczaj wrote:
>>>During socket release, sock_orphan() is called without considering that it
>>>sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
>>>null pointer dereferenced in virtio_transport_wait_close().
>>>
>>>Orphan the socket only after transport release.
>>>
>>>Partially reverts the 'Fixes:' commit.
>>>
>>>KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
>>>lock_acquire+0x19e/0x500
>>>_raw_spin_lock_irqsave+0x47/0x70
>>>add_wait_queue+0x46/0x230
>>>virtio_transport_release+0x4e7/0x7f0
>>>__vsock_release+0xfd/0x490
>>>vsock_release+0x90/0x120
>>>__sock_release+0xa3/0x250
>>>sock_close+0x14/0x20
>>>__fput+0x35e/0xa90
>>>__x64_sys_close+0x78/0xd0
>>>do_syscall_64+0x93/0x1b0
>>>entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>
>>>Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com
>>>Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c
>>>Fixes: fcdd2242c023 ("vsock: Keep the binding until socket destruction")
>>
>>Looking better at that patch, can you check if we break commit
>>3a5cc90a4d17 ("vsock/virtio: remove socket from connected/bound list 
>>on shutdown")
>>
>>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a5cc90a4d1756072619fe511d07621bdef7f120
>>
>I worked with Filippo (+CC) on this patch.
>
>IMHO it shouldn't do any harm. `sock_orphan` sets sk->sk_socket and 
>sk_wq to NULL, and sets the SOCK_DEAD flag.
>
>This patch sets the latter in the same place. All the other fields are 
>not used by the transport->release() (at least in virtio-based 
>transports), so from my perspective there is no real change.
>
>What was your concern?

My concern was more about calling `vsock_remove_sock()` in 
virtio_transport_recv_connected:

I mean this block:
	case VIRTIO_VSOCK_OP_SHUTDOWN:
		if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SHUTDOWN_RCV)
			vsk->peer_shutdown |= RCV_SHUTDOWN;
		if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
			vsk->peer_shutdown |= SEND_SHUTDOWN;
		if (vsk->peer_shutdown == SHUTDOWN_MASK) {
			if (vsock_stream_has_data(vsk) <= 0 && !sock_flag(sk, SOCK_DONE)) {
				(void)virtio_transport_reset(vsk, NULL);
				virtio_transport_do_close(vsk, true);
			}
			/* Remove this socket anyway because the remote peer sent
			 * the shutdown. This way a new connection will succeed
			 * if the remote peer uses the same source port,
			 * even if the old socket is still unreleased, but now disconnected.
			 */
			vsock_remove_sock(vsk);
		}

After commit fcdd2242c023 ("vsock: Keep the binding until socket 
destruction") calling `vsock_remove_sock` without SOCK_DEAD set, removes 
the socket only from the connected list.

So, IMHO there is a real change, but I'm not sure if it's an issue or 
not, since the issue fixed by commit 3a5cc90a4d17 ("vsock/virtio: remove 
socket from connected/bound list on shutdown") was more about the remote 
port IIRC, so that should only be affected by the connected list, which 
is stll touched now.

Stefano
Michal Luczaj Feb. 4, 2025, 11:59 p.m. UTC | #4
On 2/4/25 11:32, Stefano Garzarella wrote:
> On Tue, Feb 04, 2025 at 01:29:52AM +0100, Michal Luczaj wrote:
>> @@ -824,13 +824,14 @@ static void __vsock_release(struct sock *sk, int level)
>> 	 */
>> 	lock_sock_nested(sk, level);
>>
> 
> I would add a comment here to explain that we need to set it, so 
> vsock_remove_sock() called here some lines above, or by transports in 
> the release() callback (maybe in the future we can refactor it, and call 
> it only here) will remove the binding only if it's set, since the 
> release() is also called when de-assigning the transport.
> 
>> -	sock_orphan(sk);
>> +	sock_set_flag(sk, SOCK_DEAD);

OK, will do.

Thanks,
Michal
Michal Luczaj Feb. 5, 2025, 12:11 a.m. UTC | #5
On 2/4/25 17:00, Stefano Garzarella wrote:
> On Tue, Feb 04, 2025 at 04:44:13PM +0100, Luigi Leonardi wrote:
>> On Tue, Feb 04, 2025 at 11:32:54AM +0100, Stefano Garzarella wrote:
>>> On Tue, Feb 04, 2025 at 01:29:52AM +0100, Michal Luczaj wrote:
>>>> During socket release, sock_orphan() is called without considering that it
>>>> sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a
>>>> null pointer dereferenced in virtio_transport_wait_close().
>>>>
>>>> Orphan the socket only after transport release.
>>>>
>>>> Partially reverts the 'Fixes:' commit.
>>>>
>>>> KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
>>>> lock_acquire+0x19e/0x500
>>>> _raw_spin_lock_irqsave+0x47/0x70
>>>> add_wait_queue+0x46/0x230
>>>> virtio_transport_release+0x4e7/0x7f0
>>>> __vsock_release+0xfd/0x490
>>>> vsock_release+0x90/0x120
>>>> __sock_release+0xa3/0x250
>>>> sock_close+0x14/0x20
>>>> __fput+0x35e/0xa90
>>>> __x64_sys_close+0x78/0xd0
>>>> do_syscall_64+0x93/0x1b0
>>>> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>>
>>>> Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com
>>>> Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c
>>>> Fixes: fcdd2242c023 ("vsock: Keep the binding until socket destruction")
>>>
>>> Looking better at that patch, can you check if we break commit
>>> 3a5cc90a4d17 ("vsock/virtio: remove socket from connected/bound list 
>>> on shutdown")
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a5cc90a4d1756072619fe511d07621bdef7f120
>>>
>> I worked with Filippo (+CC) on this patch.
>>
>> IMHO it shouldn't do any harm. `sock_orphan` sets sk->sk_socket and 
>> sk_wq to NULL, and sets the SOCK_DEAD flag.
>>
>> This patch sets the latter in the same place. All the other fields are 
>> not used by the transport->release() (at least in virtio-based 
>> transports), so from my perspective there is no real change.
>>
>> What was your concern?
> 
> My concern was more about calling `vsock_remove_sock()` in 
> virtio_transport_recv_connected:
> 
> I mean this block:
> 	case VIRTIO_VSOCK_OP_SHUTDOWN:
> 		if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SHUTDOWN_RCV)
> 			vsk->peer_shutdown |= RCV_SHUTDOWN;
> 		if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
> 			vsk->peer_shutdown |= SEND_SHUTDOWN;
> 		if (vsk->peer_shutdown == SHUTDOWN_MASK) {
> 			if (vsock_stream_has_data(vsk) <= 0 && !sock_flag(sk, SOCK_DONE)) {
> 				(void)virtio_transport_reset(vsk, NULL);
> 				virtio_transport_do_close(vsk, true);
> 			}
> 			/* Remove this socket anyway because the remote peer sent
> 			 * the shutdown. This way a new connection will succeed
> 			 * if the remote peer uses the same source port,
> 			 * even if the old socket is still unreleased, but now disconnected.
> 			 */
> 			vsock_remove_sock(vsk);
> 		}
> 
> After commit fcdd2242c023 ("vsock: Keep the binding until socket 
> destruction") calling `vsock_remove_sock` without SOCK_DEAD set, removes 
> the socket only from the connected list.
> 
> So, IMHO there is a real change, but I'm not sure if it's an issue or 
> not, since the issue fixed by commit 3a5cc90a4d17 ("vsock/virtio: remove 
> socket from connected/bound list on shutdown") was more about the remote 
> port IIRC, so that should only be affected by the connected list, which 
> is stll touched now.

I agree, not an issue. But maybe it's worth replacing
vsock_remove_sock(vsk) with vsock_remove_connected(vsk) to better convey
what kind of removal we're talking about here?

Michal
diff mbox series

Patch

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 075695173648d3a4ecbd04e908130efdbb393b41..06250bb9afe2f253e96130b73554aae9151aaac1 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -824,13 +824,14 @@  static void __vsock_release(struct sock *sk, int level)
 	 */
 	lock_sock_nested(sk, level);
 
-	sock_orphan(sk);
+	sock_set_flag(sk, SOCK_DEAD);
 
 	if (vsk->transport)
 		vsk->transport->release(vsk);
 	else if (sock_type_connectible(sk->sk_type))
 		vsock_remove_sock(vsk);
 
+	sock_orphan(sk);
 	sk->sk_shutdown = SHUTDOWN_MASK;
 
 	skb_queue_purge(&sk->sk_receive_queue);