diff mbox series

[net] drop_monitor: fix incorrect initialization order

Message ID 20250212134150.377169-1-Ilia.Gavrilov@infotecs.ru (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] drop_monitor: fix incorrect initialization order | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants WARNING: From:/Signed-off-by: email name mismatch: 'From: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>' != 'Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>'
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-02-13--12-00 (tests: 889)

Commit Message

Gavrilov Ilia Feb. 12, 2025, 1:41 p.m. UTC
Syzkaller reports the following bug:

BUG: spinlock bad magic on CPU#1, syz-executor.0/7995
 lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G            E     5.10.209+ #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x119/0x179 lib/dump_stack.c:118
 debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline]
 do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline]
 _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
 reset_per_cpu_data+0xe6/0x240 [drop_monitor]
 net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor]
 genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739
 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
 genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800
 netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497
 genl_rcv+0x29/0x40 net/netlink/genetlink.c:811
 netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
 netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348
 netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916
 sock_sendmsg_nosec net/socket.c:651 [inline]
 __sock_sendmsg+0x157/0x190 net/socket.c:663
 ____sys_sendmsg+0x712/0x870 net/socket.c:2378
 ___sys_sendmsg+0xf8/0x170 net/socket.c:2432
 __sys_sendmsg+0xea/0x1b0 net/socket.c:2461
 do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x62/0xc7
RIP: 0033:0x7f3f9815aee9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9
RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007
RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768

If drop_monitor is built as a kernel module, syzkaller may have time
to send a netlink NET_DM_CMD_START message during the module loading.
This will call the net_dm_monitor_start() function that uses
a spinlock that has not yet been initialized.

To fix this, let's place resource initialization above the registration
of a generic netlink family.

Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.

Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
Cc: stable@vger.kernel.org
Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
---
 net/core/drop_monitor.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

Comments

Ido Schimmel Feb. 13, 2025, 2:11 p.m. UTC | #1
On Wed, Feb 12, 2025 at 01:41:51PM +0000, Gavrilov Ilia wrote:
> Syzkaller reports the following bug:
> 
> BUG: spinlock bad magic on CPU#1, syz-executor.0/7995
>  lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G            E     5.10.209+ #1
> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x119/0x179 lib/dump_stack.c:118
>  debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline]
>  do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112
>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline]
>  _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>  reset_per_cpu_data+0xe6/0x240 [drop_monitor]
>  net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor]
>  genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739
>  genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
>  genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800
>  netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497
>  genl_rcv+0x29/0x40 net/netlink/genetlink.c:811
>  netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
>  netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348
>  netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916
>  sock_sendmsg_nosec net/socket.c:651 [inline]
>  __sock_sendmsg+0x157/0x190 net/socket.c:663
>  ____sys_sendmsg+0x712/0x870 net/socket.c:2378
>  ___sys_sendmsg+0xf8/0x170 net/socket.c:2432
>  __sys_sendmsg+0xea/0x1b0 net/socket.c:2461
>  do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x62/0xc7
> RIP: 0033:0x7f3f9815aee9
> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9
> RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007
> RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768
> 
> If drop_monitor is built as a kernel module, syzkaller may have time
> to send a netlink NET_DM_CMD_START message during the module loading.
> This will call the net_dm_monitor_start() function that uses
> a spinlock that has not yet been initialized.
> 
> To fix this, let's place resource initialization above the registration
> of a generic netlink family.
> 
> Found by InfoTeCS on behalf of Linux Verification Center
> (linuxtesting.org) with SVACE.
> 
> Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
> Cc: stable@vger.kernel.org
> Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
> ---
>  net/core/drop_monitor.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
> index 6efd4cccc9dd..9755d2010e70 100644
> --- a/net/core/drop_monitor.c
> +++ b/net/core/drop_monitor.c
> @@ -1734,6 +1734,11 @@ static int __init init_net_drop_monitor(void)
>  		return -ENOSPC;
>  	}
>  
> +	for_each_possible_cpu(cpu) {
> +		net_dm_cpu_data_init(cpu);
> +		net_dm_hw_cpu_data_init(cpu);
> +	}
> +
>  	rc = genl_register_family(&net_drop_monitor_family);

This might be fine as-is, but it would be cleaner to move the
registration of the netlink family to the end of the module
initialization function, so that it's exposed to user space after all
the preparations have been done, including the registration of the net
device notifier.

>  	if (rc) {
>  		pr_err("Could not create drop monitor netlink family\n");
> @@ -1749,11 +1754,6 @@ static int __init init_net_drop_monitor(void)
>  
>  	rc = 0;
>  
> -	for_each_possible_cpu(cpu) {
> -		net_dm_cpu_data_init(cpu);
> -		net_dm_hw_cpu_data_init(cpu);
> -	}
> -
>  	goto out;
>  
>  out_unreg:
> @@ -1772,13 +1772,12 @@ static void exit_net_drop_monitor(void)
>  	 * Because of the module_get/put we do in the trace state change path
>  	 * we are guaranteed not to have any current users when we get here
>  	 */
> +	BUG_ON(genl_unregister_family(&net_drop_monitor_family));

Similarly, unregister the netlink family at the beginning of the module
exit function, before unregistering the net device notifier.

>  
>  	for_each_possible_cpu(cpu) {
>  		net_dm_hw_cpu_data_fini(cpu);
>  		net_dm_cpu_data_fini(cpu);
>  	}
> -
> -	BUG_ON(genl_unregister_family(&net_drop_monitor_family));
>  }
>  
>  module_init(init_net_drop_monitor);
> -- 
> 2.39.5
>
Gavrilov Ilia Feb. 13, 2025, 2:28 p.m. UTC | #2
On 2/13/25 17:11, Ido Schimmel wrote:
> On Wed, Feb 12, 2025 at 01:41:51PM +0000, Gavrilov Ilia wrote:
>> Syzkaller reports the following bug:
>>
>> BUG: spinlock bad magic on CPU#1, syz-executor.0/7995
>>  lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
>> CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G            E     5.10.209+ #1
>> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x119/0x179 lib/dump_stack.c:118
>>  debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline]
>>  do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112
>>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline]
>>  _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>  reset_per_cpu_data+0xe6/0x240 [drop_monitor]
>>  net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor]
>>  genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739
>>  genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
>>  genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800
>>  netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497
>>  genl_rcv+0x29/0x40 net/netlink/genetlink.c:811
>>  netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
>>  netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348
>>  netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916
>>  sock_sendmsg_nosec net/socket.c:651 [inline]
>>  __sock_sendmsg+0x157/0x190 net/socket.c:663
>>  ____sys_sendmsg+0x712/0x870 net/socket.c:2378
>>  ___sys_sendmsg+0xf8/0x170 net/socket.c:2432
>>  __sys_sendmsg+0xea/0x1b0 net/socket.c:2461
>>  do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46
>>  entry_SYSCALL_64_after_hwframe+0x62/0xc7
>> RIP: 0033:0x7f3f9815aee9
>> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>> RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9
>> RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007
>> RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>> R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768
>>
>> If drop_monitor is built as a kernel module, syzkaller may have time
>> to send a netlink NET_DM_CMD_START message during the module loading.
>> This will call the net_dm_monitor_start() function that uses
>> a spinlock that has not yet been initialized.
>>
>> To fix this, let's place resource initialization above the registration
>> of a generic netlink family.
>>
>> Found by InfoTeCS on behalf of Linux Verification Center
>> (linuxtesting.org) with SVACE.
>>
>> Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
>> ---
>>  net/core/drop_monitor.c | 13 ++++++-------
>>  1 file changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
>> index 6efd4cccc9dd..9755d2010e70 100644
>> --- a/net/core/drop_monitor.c
>> +++ b/net/core/drop_monitor.c
>> @@ -1734,6 +1734,11 @@ static int __init init_net_drop_monitor(void)
>>  		return -ENOSPC;
>>  	}
>>  
>> +	for_each_possible_cpu(cpu) {
>> +		net_dm_cpu_data_init(cpu);
>> +		net_dm_hw_cpu_data_init(cpu);
>> +	}
>> +
>>  	rc = genl_register_family(&net_drop_monitor_family);
> 
> This might be fine as-is, but it would be cleaner to move the
> registration of the netlink family to the end of the module
> initialization function, so that it's exposed to user space after all
> the preparations have been done, including the registration of the net
> device notifier.
> 

Thank you for your review. I'll send it in V2.

>>  	if (rc) {
>>  		pr_err("Could not create drop monitor netlink family\n");
>> @@ -1749,11 +1754,6 @@ static int __init init_net_drop_monitor(void)
>>  
>>  	rc = 0;
>>  
>> -	for_each_possible_cpu(cpu) {
>> -		net_dm_cpu_data_init(cpu);
>> -		net_dm_hw_cpu_data_init(cpu);
>> -	}
>> -
>>  	goto out;
>>  
>>  out_unreg:
>> @@ -1772,13 +1772,12 @@ static void exit_net_drop_monitor(void)
>>  	 * Because of the module_get/put we do in the trace state change path
>>  	 * we are guaranteed not to have any current users when we get here
>>  	 */
>> +	BUG_ON(genl_unregister_family(&net_drop_monitor_family));
> 
> Similarly, unregister the netlink family at the beginning of the module
> exit function, before unregistering the net device notifier.
> 
>>  
>>  	for_each_possible_cpu(cpu) {
>>  		net_dm_hw_cpu_data_fini(cpu);
>>  		net_dm_cpu_data_fini(cpu);
>>  	}
>> -
>> -	BUG_ON(genl_unregister_family(&net_drop_monitor_family));
>>  }
>>  
>>  module_init(init_net_drop_monitor);
>> -- 
>> 2.39.5
>>
diff mbox series

Patch

diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 6efd4cccc9dd..9755d2010e70 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -1734,6 +1734,11 @@  static int __init init_net_drop_monitor(void)
 		return -ENOSPC;
 	}
 
+	for_each_possible_cpu(cpu) {
+		net_dm_cpu_data_init(cpu);
+		net_dm_hw_cpu_data_init(cpu);
+	}
+
 	rc = genl_register_family(&net_drop_monitor_family);
 	if (rc) {
 		pr_err("Could not create drop monitor netlink family\n");
@@ -1749,11 +1754,6 @@  static int __init init_net_drop_monitor(void)
 
 	rc = 0;
 
-	for_each_possible_cpu(cpu) {
-		net_dm_cpu_data_init(cpu);
-		net_dm_hw_cpu_data_init(cpu);
-	}
-
 	goto out;
 
 out_unreg:
@@ -1772,13 +1772,12 @@  static void exit_net_drop_monitor(void)
 	 * Because of the module_get/put we do in the trace state change path
 	 * we are guaranteed not to have any current users when we get here
 	 */
+	BUG_ON(genl_unregister_family(&net_drop_monitor_family));
 
 	for_each_possible_cpu(cpu) {
 		net_dm_hw_cpu_data_fini(cpu);
 		net_dm_cpu_data_fini(cpu);
 	}
-
-	BUG_ON(genl_unregister_family(&net_drop_monitor_family));
 }
 
 module_init(init_net_drop_monitor);