[net] net/smc: fix general protection fault in __smc_diag_dump

Message ID 20250331081003.1503211-1-wangliang74@huawei.com (mailing list archive)
State Rejected
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 13 of 13 maintainers
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 7 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2025-04-01--00-00 (tests: 902)

Commit Message

Wang Liang March 31, 2025, 8:10 a.m. UTC
Syzbot reported a general protection fault:

  CPU: 0 UID: 0 PID: 5830 Comm: syz-executor600 Not tainted 6.14.0-rc4-syzkaller-00090-gdd83757f6e68 #0
  RIP: 0010:smc_diag_msg_common_fill net/smc/smc_diag.c:44 [inline]
  RIP: 0010:__smc_diag_dump.constprop.0+0x3de/0x23d0 net/smc/smc_diag.c:89
  Call Trace:
   <TASK>
   smc_diag_dump_proto+0x26d/0x420 net/smc/smc_diag.c:217
   smc_diag_dump+0x84/0x90 net/smc/smc_diag.c:236
   netlink_dump+0x53c/0xd00 net/netlink/af_netlink.c:2318
   __netlink_dump_start+0x6ca/0x970 net/netlink/af_netlink.c:2433
   netlink_dump_start include/linux/netlink.h:340 [inline]
   smc_diag_handler_dump+0x1fb/0x240 net/smc/smc_diag.c:251
   __sock_diag_cmd net/core/sock_diag.c:249 [inline]
   sock_diag_rcv_msg+0x437/0x790 net/core/sock_diag.c:287
   netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2543
   netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
   netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1348
   netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1892
   sock_sendmsg_nosec net/socket.c:718 [inline]
   __sock_sendmsg net/socket.c:733 [inline]
   ____sys_sendmsg+0xaaf/0xc90 net/socket.c:2573
   ___sys_sendmsg+0x135/0x1e0 net/socket.c:2627
   __sys_sendmsg+0x16e/0x220 net/socket.c:2659
   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
   do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
   </TASK>

When an SMC socket is created, smc_inet_init_sock() first adds sk to the
smc_hash via smc_hash_sk() and only then creates smc->clcsock. It is
therefore possible that smc_diag_dump_proto(), while traversing the
smc_hash, visits a socket whose smc->clcsock has not been created yet.

The sequence looks like this:

  (CPU1)                         | (CPU2)
  inet6_create()                 |
    smc_inet_init_sock()         |
      smc_sk_init()              |
        smc_hash_sk()            |
          head = &smc_hash->ht;  |
          sk_add_node(sk, head); |
                                 | smc_diag_dump_proto
                                 |   head = &smc_hash->ht;
                                 |     sk_for_each(sk, head)
                                 |       __smc_diag_dump()
                                 |         visit smc->clcsock
      smc_create_clcsk()         |
          set smc->clcsock       |

Fix this by initializing smc->clcsock to NULL in smc_sk_init() before sk
is added to the smc_hash.

Reported-by: syzbot+271fed3ed6f24600c364@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=271fed3ed6f24600c364
Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
---
 net/smc/af_smc.c | 1 +
 1 file changed, 1 insertion(+)
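
The ordering described in the commit message (hash the socket first, set
clcsock later) can be sketched in user space. This is only a
single-threaded model under stated assumptions: every model_* name below
is hypothetical, the linked list stands in for the smc_hash chain, and
the NULL check in the reader is an illustration of what a dump path
would need if it can meet a half-initialized entry. It is not the
kernel's code and not the fix proposed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real layouts. */
struct model_clcsock {
	int family;
};

struct model_smc_sock {
	struct model_smc_sock *next;    /* stand-in for the smc_hash chain */
	struct model_clcsock *clcsock;  /* attached only after hashing */
};

static struct model_smc_sock *model_hash_head;

/* CPU1, step 1: publish the socket (models smc_hash_sk()). */
static void model_hash_sk(struct model_smc_sock *smc)
{
	assert(smc != NULL);
	smc->next = model_hash_head;
	model_hash_head = smc;
}

/* CPU1, step 2: attach the internal socket (models smc_create_clcsk()). */
static int model_create_clcsk(struct model_smc_sock *smc, int family)
{
	smc->clcsock = malloc(sizeof(*smc->clcsock));
	if (!smc->clcsock)
		return -1;
	smc->clcsock->family = family;
	return 0;
}

/*
 * CPU2: walk the chain (models the sk_for_each() loop in the dump path).
 * Entries without a clcsock are skipped instead of dereferenced.
 * Returns the number of entries dumped; *skipped counts the rest.
 */
static int model_diag_dump(int *skipped)
{
	int dumped = 0;

	assert(skipped != NULL);
	*skipped = 0;
	for (struct model_smc_sock *smc = model_hash_head; smc; smc = smc->next) {
		if (!smc->clcsock) {
			(*skipped)++;
			continue;
		}
		dumped++;
	}
	return dumped;
}
```

Calling model_hash_sk() and then model_diag_dump() before
model_create_clcsk() reports the entry as skipped; once
model_create_clcsk() succeeds, the same entry is counted as dumped.
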

Comments

Paolo Abeni April 1, 2025, 11:01 a.m. UTC | #1
On 3/31/25 10:10 AM, Wang Liang wrote:
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 3e6cb35baf25..454801188514 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -371,6 +371,7 @@ void smc_sk_init(struct net *net, struct sock *sk, int protocol)
>  	sk->sk_protocol = protocol;
>  	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
>  	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
> +	smc->clcsock = NULL;
>  	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
>  	INIT_WORK(&smc->connect_work, smc_connect_work);
>  	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);

The syzkaller report has a few reproducers; have you tested this? AFAICS
the smc socket is already zeroed on allocation by sk_alloc().

/P
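
The point about zeroed allocation can be illustrated with a small
user-space sketch. It assumes calloc() as a stand-in for a zeroing
allocator such as sk_alloc(); struct model_sock and both helper names
are hypothetical. If the object is already zero-filled, assigning NULL
to the pointer member afterwards changes nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the socket object; not the kernel layout. */
struct model_sock {
	void *clcsock;
	int state;
};

/* Models a zeroing allocator; calloc() is an assumption of this sketch. */
static struct model_sock *model_sk_alloc(void)
{
	return calloc(1, sizeof(struct model_sock));
}

/*
 * Returns 1 when the explicit NULL store left the member unchanged.
 * Relies on all-bits-zero being a null pointer, which holds on the
 * platforms Linux supports.
 */
static int model_explicit_null_is_noop(struct model_sock *sk)
{
	void *before;

	assert(sk != NULL);
	before = sk->clcsock;
	sk->clcsock = NULL;
	return before == sk->clcsock;
}
```

On a freshly allocated object model_explicit_null_is_noop() returns 1,
which is why an added "smc->clcsock = NULL;" is not expected to change
behavior when the allocation is already zeroed.
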
Zhu Yanjun April 1, 2025, 1:31 p.m. UTC | #2
On 01.04.25 13:01, Paolo Abeni wrote:
> On 3/31/25 10:10 AM, Wang Liang wrote:
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index 3e6cb35baf25..454801188514 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -371,6 +371,7 @@ void smc_sk_init(struct net *net, struct sock *sk, int protocol)
>>   	sk->sk_protocol = protocol;
>>   	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
>>   	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
>> +	smc->clcsock = NULL;
>>   	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
>>   	INIT_WORK(&smc->connect_work, smc_connect_work);
>>   	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
> 
> The syzkaller report has a few reproducers; have you tested this? AFAICS
> the smc socket is already zeroed on allocation by sk_alloc().

Yes. I agree that the smc socket should already have been zeroed.

This commit only sets the member variable to NULL explicitly, and I am
not sure whether that fixes the problem.

Judging by the following syzbot reply, the problem can still be
reproduced:
"
syzbot has tested the proposed patch but the reproducer is still
triggering an issue:
general protection fault in __smc_diag_dump
"

So please follow the instructions at this link to test the patch:

https://groups.google.com/g/syzkaller-bugs/c/YwENRImdcsk/m/wBJo6qGiCAAJ?pli=1

Replying as follows makes syzbot run the reproducer:

"
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
"

Zhu Yanjun

> 
> /P
>

Patch

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 3e6cb35baf25..454801188514 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -371,6 +371,7 @@  void smc_sk_init(struct net *net, struct sock *sk, int protocol)
 	sk->sk_protocol = protocol;
 	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
 	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
+	smc->clcsock = NULL;
 	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
 	INIT_WORK(&smc->connect_work, smc_connect_work);
 	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);