[net] net/smc: fix general protection fault in __smc_diag_dump

Message ID 20250331081003.1503211-1-wangliang74@huawei.com (mailing list archive)
State Not Applicable

Commit Message

Wang Liang March 31, 2025, 8:10 a.m. UTC
Syzbot reported a general protection fault:

  CPU: 0 UID: 0 PID: 5830 Comm: syz-executor600 Not tainted 6.14.0-rc4-syzkaller-00090-gdd83757f6e68 #0
  RIP: 0010:smc_diag_msg_common_fill net/smc/smc_diag.c:44 [inline]
  RIP: 0010:__smc_diag_dump.constprop.0+0x3de/0x23d0 net/smc/smc_diag.c:89
  Call Trace:
   <TASK>
   smc_diag_dump_proto+0x26d/0x420 net/smc/smc_diag.c:217
   smc_diag_dump+0x84/0x90 net/smc/smc_diag.c:236
   netlink_dump+0x53c/0xd00 net/netlink/af_netlink.c:2318
   __netlink_dump_start+0x6ca/0x970 net/netlink/af_netlink.c:2433
   netlink_dump_start include/linux/netlink.h:340 [inline]
   smc_diag_handler_dump+0x1fb/0x240 net/smc/smc_diag.c:251
   __sock_diag_cmd net/core/sock_diag.c:249 [inline]
   sock_diag_rcv_msg+0x437/0x790 net/core/sock_diag.c:287
   netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2543
   netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
   netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1348
   netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1892
   sock_sendmsg_nosec net/socket.c:718 [inline]
   __sock_sendmsg net/socket.c:733 [inline]
   ____sys_sendmsg+0xaaf/0xc90 net/socket.c:2573
   ___sys_sendmsg+0x135/0x1e0 net/socket.c:2627
   __sys_sendmsg+0x16e/0x220 net/socket.c:2659
   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
   do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
   </TASK>

When creating an smc socket, smc_inet_init_sock() first adds the sk to the
smc_hash via smc_hash_sk(), and only then creates smc->clcsock. It is
therefore possible that smc_diag_dump_proto(), while traversing the smc_hash,
visits the sk before smc->clcsock has been created.

The process looks like this:

  (CPU1)                         | (CPU2)
  inet6_create()                 |
    smc_inet_init_sock()         |
      smc_sk_init()              |
        smc_hash_sk()            |
          head = &smc_hash->ht;  |
          sk_add_node(sk, head); |
                                 | smc_diag_dump_proto
                                 |   head = &smc_hash->ht;
                                 |     sk_for_each(sk, head)
                                 |       __smc_diag_dump()
                                 |         visit smc->clcsock
      smc_create_clcsk()         |
          set smc->clcsock       |

Fix this by initializing smc->clcsock to NULL before adding the sk to
smc_hash in smc_sk_init().
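
The ordering above can be sketched in a single thread as follows. All toy_*
names are hypothetical stand-ins, not the kernel's structures, and the sketch
assumes the dump path skips a socket whose clcsock is NULL. It does not model
real concurrency (no locking or memory ordering); it only shows why resetting
the pointer before publishing the entry makes the window harmless to a reader
that applies such a guard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct smc_sock or struct socket. */
struct toy_clcsock { int port; };

struct toy_smc {
	struct toy_clcsock *clcsock; /* NULL until the second creation step runs */
	struct toy_smc *next;        /* link in the toy "smc_hash" list */
};

static struct toy_smc *toy_hash_head;

/* Mirrors the idea of the fix: reset the pointer, then publish the entry. */
static void toy_sk_init(struct toy_smc *smc)
{
	smc->clcsock = NULL;
	smc->next = toy_hash_head;
	toy_hash_head = smc;
}

/* Second creation step, which runs after the entry is already visible. */
static void toy_create_clcsk(struct toy_smc *smc, struct toy_clcsock *clc)
{
	smc->clcsock = clc;
}

/* Reader: counts entries it can report on. Assumed guard: skip NULL clcsock. */
static int toy_diag_dump(void)
{
	int reported = 0;

	for (struct toy_smc *smc = toy_hash_head; smc; smc = smc->next) {
		if (!smc->clcsock)
			continue; /* published, but not fully created yet */
		assert(smc->clcsock->port > 0); /* safe to dereference here */
		reported++;
	}
	return reported;
}
```

Calling toy_diag_dump() between toy_sk_init() and toy_create_clcsk()
corresponds to the CPU2 column in the diagram: the entry is visible, but the
reader skips it instead of dereferencing a stale pointer.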

Reported-by: syzbot+271fed3ed6f24600c364@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=271fed3ed6f24600c364
Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
---
 net/smc/af_smc.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Paolo Abeni April 1, 2025, 11:01 a.m. UTC | #1
On 3/31/25 10:10 AM, Wang Liang wrote:
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 3e6cb35baf25..454801188514 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -371,6 +371,7 @@ void smc_sk_init(struct net *net, struct sock *sk, int protocol)
>  	sk->sk_protocol = protocol;
>  	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
>  	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
> +	smc->clcsock = NULL;
>  	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
>  	INIT_WORK(&smc->connect_work, smc_connect_work);
>  	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);

The syzkaller report has a few reproducers, have you tested this? AFAICS
the smc socket is already zeroed on allocation by sk_alloc().

/P
Zhu Yanjun April 1, 2025, 1:31 p.m. UTC | #2
On 01.04.25 13:01, Paolo Abeni wrote:
> On 3/31/25 10:10 AM, Wang Liang wrote:
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index 3e6cb35baf25..454801188514 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -371,6 +371,7 @@ void smc_sk_init(struct net *net, struct sock *sk, int protocol)
>>   	sk->sk_protocol = protocol;
>>   	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
>>   	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
>> +	smc->clcsock = NULL;
>>   	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
>>   	INIT_WORK(&smc->connect_work, smc_connect_work);
>>   	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
> 
> The syzkaller report has a few reproducers, have you tested this? AFAICS
> the smc socket is already zeroed on allocation by sk_alloc().

Yes. I also agree with you that smc socket should have already been zeroed.

Currently in this commit, this member variable is set to NULL 
explicitly. I am not sure if this can fix this problem or not.

Based on the following, it seems that this problem can still be reproduced.
"
syzbot has tested the proposed patch but the reproducer is still 
triggering an issue:
general protection fault in __smc_diag_dump
"

So please follow the instructions at this link to test:

https://groups.google.com/g/syzkaller-bugs/c/YwENRImdcsk/m/wBJo6qGiCAAJ?pli=1

The following triggers the reproducer:

"
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
"

Zhu Yanjun

> 
> /P
>
Wang Liang April 2, 2025, 2:37 a.m. UTC | #3
On 2025/4/1 19:01, Paolo Abeni wrote:
> On 3/31/25 10:10 AM, Wang Liang wrote:
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index 3e6cb35baf25..454801188514 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -371,6 +371,7 @@ void smc_sk_init(struct net *net, struct sock *sk, int protocol)
>>   	sk->sk_protocol = protocol;
>>   	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
>>   	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
>> +	smc->clcsock = NULL;
>>   	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
>>   	INIT_WORK(&smc->connect_work, smc_connect_work);
>>   	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);
> The syzkaller report has a few reproducers, have you tested this? AFAICS
> the smc socket is already zeroed on allocation by sk_alloc().


Yes, I tested it with the C repro:
https://syzkaller.appspot.com/text?tag=ReproC&x=13d2dc98580000

The C repro comes from the 2025/02/27 15:16 crash at
   https://syzkaller.appspot.com/bug?extid=271fed3ed6f24600c364

After applying my patch, the crash no longer happens when running the C repro.

"the smc socket is already zeroed on allocation by sk_alloc()": that is
right.
However, smc->clcsock may be modified indirectly in inet6_create().
The process looks like this:

   __sys_socket
     __sys_socket_create
       sock_create
         __sock_create
           # pf->create
           inet6_create
             // init smc->clcsock = 0
             sk = sk_alloc()

             // set smc->clcsock to invalid address
             inet = inet_sk(sk);
             inet_assign_bit(IS_ICSK, sk, INET_PROTOSW_ICSK & answer_flags);
             inet6_set_bit(MC6_LOOP, sk);
             inet6_set_bit(MC6_ALL, sk);

             smc_inet_init_sock
               smc_sk_init
                 // add sk to smc_hash
                 smc_hash_sk
                   sk_add_node(sk, head);
               smc_create_clcsk
                 // set smc->clcsock
                 sock_create_kern(..., &smc->clcsock);

So initializing smc->clcsock to NULL explicitly in smc_sk_init() fixes
this crash scenario. If the problem can still be reproduced after this
patch, I guess it has a different cause, and fixing it in a separate patch
is more appropriate.
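
The mechanism described in the call chain above (a zeroed object whose
pointer is later changed through a different view of the same memory) can be
illustrated with a small sketch. The toy_* layouts below are hypothetical and
deliberately simplified; they do not reproduce the real field offsets of
struct inet_sock or struct smc_sock, and only show how flag bits written
through one view can turn a zeroed pointer in another view into a non-NULL,
invalid address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts: NOT the kernel's real structures. */
struct toy_sock { int family; };

/* View used by the generic "inet" creation path. */
struct toy_inet_sock {
	struct toy_sock sk;   /* common head, must be first */
	uintptr_t inet_flags; /* bit flags set during creation */
};

/* View used by the protocol code. */
struct toy_smc_sock {
	struct toy_sock sk;   /* common head, must be first */
	void *clcsock;        /* expected to stay NULL until created */
};

/* The sketch only works if both fields occupy the same offset. */
_Static_assert(offsetof(struct toy_inet_sock, inet_flags) ==
	       offsetof(struct toy_smc_sock, clcsock),
	       "toy layouts must overlap for this illustration");

/* One allocation, two views of it. */
union toy_obj {
	struct toy_inet_sock inet;
	struct toy_smc_sock smc;
};

/* Step 1: allocation zeroes the whole object. */
static void toy_alloc(union toy_obj *o)
{
	memset(o, 0, sizeof(*o));
}

/* Step 2: generic creation code sets flag bits through the inet view. */
static void toy_inet_create(union toy_obj *o)
{
	o->inet.inet_flags |= (uintptr_t)1 << 0;
	o->inet.inet_flags |= (uintptr_t)1 << 3;
}

/* Step 3 (the idea of the patch): reset the pointer before publishing. */
static void toy_smc_sk_init(union toy_obj *o)
{
	o->smc.clcsock = NULL;
}

static int toy_clcsock_is_null(const union toy_obj *o)
{
	return o->smc.clcsock == NULL;
}
```

Running the three steps in order, toy_clcsock_is_null() is true after
toy_alloc(), false after toy_inet_create(), and true again after
toy_smc_sk_init(), which is why zeroing at allocation alone is not sufficient
in this toy model.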

>
> /P
>
>

Patch

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 3e6cb35baf25..454801188514 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -371,6 +371,7 @@  void smc_sk_init(struct net *net, struct sock *sk, int protocol)
 	sk->sk_protocol = protocol;
 	WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
 	WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
+	smc->clcsock = NULL;
 	INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
 	INIT_WORK(&smc->connect_work, smc_connect_work);
 	INIT_DELAYED_WORK(&smc->conn.tx_work, smc_tx_work);