net/ipv4: Fix circular deadlock in do_ip_setsockop

Message ID: 20240917235027.218692-2-srikarananta01@gmail.com (mailing list archive)
State: Rejected
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/series_format            warning  Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 16 this patch: 16
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 5 of 5 maintainers
netdev/build_clang              success  Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 16 this patch: 16
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 12 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-09-19--09-00 (tests: 764)

Commit Message

Ananta Srikar Puranam Sept. 17, 2024, 11:50 p.m. UTC
Fixed the circular lock dependency reported by syzkaller.

Signed-off-by: AnantaSrikar <srikarananta01@gmail.com>
Reported-by: syzbot+e4c27043b9315839452d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e4c27043b9315839452d
Fixes: d2bafcf224f3 ("Merge tag 'cgroup-for-6.11-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup")
---
 net/ipv4/ip_sockglue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

kernel test robot Sept. 22, 2024, 9:26 a.m. UTC | #1
Hello,

kernel test robot noticed "WARNING:possible_circular_locking_dependency_detected" on:

commit: 1b1e90e04f3485bbd37b605a863b16f42fa9566c ("[PATCH] net/ipv4: Fix circular deadlock in do_ip_setsockop")
url: https://github.com/intel-lab-lkp/linux/commits/AnantaSrikar/net-ipv4-Fix-circular-deadlock-in-do_ip_setsockop/20240918-075223
base: https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git 9410645520e9b820069761f3450ef6661418e279
patch link: https://lore.kernel.org/all/20240917235027.218692-2-srikarananta01@gmail.com/
patch subject: [PATCH] net/ipv4: Fix circular deadlock in do_ip_setsockop

in testcase: trinity
version: trinity-i386-abe9de86-1_20230429
with following parameters:

	runtime: 300s
	group: group-00
	nr_groups: 5



compiler: clang-18
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202409221753.e29d62c8-lkp@intel.com


[  102.908754][T20485] WARNING: possible circular locking dependency detected
[  102.909639][T20485] 6.11.0-01459-g1b1e90e04f34 #1 Not tainted
[  102.910197][T20485] ------------------------------------------------------
[  102.910822][T20485] trinity-c2/20485 is trying to acquire lock:
[102.911369][T20485] c2ab6a78 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock (net/core/rtnetlink.c:80) 
[  102.912029][T20485]
[  102.912029][T20485] but task is already holding lock:
[102.912663][T20485] edce5dd8 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sockopt_lock_sock (include/net/sock.h:? net/core/sock.c:1125) 
[  102.913455][T20485]
[  102.913455][T20485] which lock already depends on the new lock.
[  102.913455][T20485]
[  102.914386][T20485]
[  102.914386][T20485] the existing dependency chain (in reverse order) is:
[  102.915187][T20485]
[  102.915187][T20485] -> #1 (sk_lock-AF_INET6){+.+.}-{0:0}:
[102.915862][T20485] lock_sock_nested (net/core/sock.c:3611) 
[102.916319][T20485] sockopt_lock_sock (include/net/sock.h:? net/core/sock.c:1125) 
[102.916778][T20485] do_ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:?) 
[102.917283][T20485] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:?) 
[102.917740][T20485] udpv6_setsockopt (net/ipv6/udp.c:1702) 
[102.918226][T20485] sock_common_setsockopt (net/core/sock.c:3803) 
[102.918766][T20485] __sys_setsockopt (net/socket.c:? net/socket.c:2353) 
[102.919224][T20485] __ia32_sys_socketcall (net/socket.c:?) 
[102.919727][T20485] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-053-20240920/./arch/x86/include/generated/asm/syscalls_32.h:?) 
[102.920298][T20485] __do_fast_syscall_32 (arch/x86/entry/common.c:?) 
[102.920778][T20485] do_fast_syscall_32 (arch/x86/entry/common.c:411) 
[102.921263][T20485] do_SYSENTER_32 (arch/x86/entry/common.c:449) 
[102.921720][T20485] entry_SYSENTER_32 (arch/x86/entry/entry_32.S:836) 
[  102.922222][T20485]
[  102.922222][T20485] -> #0 (rtnl_mutex){+.+.}-{3:3}:
[102.922871][T20485] __lock_acquire (kernel/locking/lockdep.c:?) 
[102.923356][T20485] lock_acquire (kernel/locking/lockdep.c:5759) 
[102.923817][T20485] __mutex_lock_common (kernel/locking/mutex.c:608) 
[102.924293][T20485] mutex_lock_nested (kernel/locking/mutex.c:752 kernel/locking/mutex.c:804) 
[102.924748][T20485] rtnl_lock (net/core/rtnetlink.c:80) 
[102.925154][T20485] do_ip_setsockopt (net/ipv4/ip_sockglue.c:1082) 
[102.925613][T20485] ip_setsockopt (net/ipv4/ip_sockglue.c:1419) 
[102.926060][T20485] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:?) 
[102.926522][T20485] tcp_setsockopt (net/ipv4/tcp.c:?) 
[102.926976][T20485] sock_common_setsockopt (net/core/sock.c:3803) 
[102.927459][T20485] __sys_setsockopt (net/socket.c:? net/socket.c:2353) 
[102.927910][T20485] __ia32_sys_setsockopt (net/socket.c:2362 net/socket.c:2359 net/socket.c:2359) 
[102.928384][T20485] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-053-20240920/./arch/x86/include/generated/asm/syscalls_32.h:?) 
[102.928839][T20485] __do_fast_syscall_32 (arch/x86/entry/common.c:?) 
[102.929362][T20485] do_fast_syscall_32 (arch/x86/entry/common.c:411) 
[102.929824][T20485] do_SYSENTER_32 (arch/x86/entry/common.c:449) 
[102.930273][T20485] entry_SYSENTER_32 (arch/x86/entry/entry_32.S:836) 
[  102.930754][T20485]
[  102.930754][T20485] other info that might help us debug this:
[  102.930754][T20485]
[  102.931662][T20485]  Possible unsafe locking scenario:
[  102.931662][T20485]
[  102.932291][T20485]        CPU0                    CPU1
[  102.932748][T20485]        ----                    ----
[  102.933216][T20485]   lock(sk_lock-AF_INET6);
[  102.937469][T20485]                                lock(rtnl_mutex);
[  102.938054][T20485]                                lock(sk_lock-AF_INET6);
[  102.938658][T20485]   lock(rtnl_mutex);
[  102.939013][T20485]
[  102.939013][T20485]  *** DEADLOCK ***
[  102.939013][T20485]
[  102.939714][T20485] 1 lock held by trinity-c2/20485:
[102.940182][T20485] #0: edce5dd8 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sockopt_lock_sock (include/net/sock.h:? net/core/sock.c:1125) 
[  102.940929][T20485]
[  102.940929][T20485] stack backtrace:
[  102.941423][T20485] CPU: 1 UID: 65534 PID: 20485 Comm: trinity-c2 Not tainted 6.11.0-01459-g1b1e90e04f34 #1
[  102.942250][T20485] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  102.943160][T20485] Call Trace:
[102.943455][T20485] dump_stack_lvl (lib/dump_stack.c:121) 
[102.943867][T20485] dump_stack (lib/dump_stack.c:128) 
[102.944221][T20485] print_circular_bug (kernel/locking/lockdep.c:?) 
[102.944654][T20485] check_noncircular (kernel/locking/lockdep.c:2186) 
[102.945117][T20485] __lock_acquire (kernel/locking/lockdep.c:?) 
[102.945557][T20485] ? kvm_sched_clock_read (arch/x86/kernel/kvmclock.c:91) 
[102.946018][T20485] ? sched_clock_noinstr (arch/x86/kernel/tsc.c:266) 
[102.946456][T20485] ? local_clock_noinstr (kernel/sched/clock.c:301) 
[102.946906][T20485] ? kvm_sched_clock_read (arch/x86/kernel/kvmclock.c:91) 
[102.947350][T20485] ? sched_clock_noinstr (arch/x86/kernel/tsc.c:266) 
[102.947782][T20485] ? local_clock_noinstr (kernel/sched/clock.c:301) 
[102.948222][T20485] ? kvm_sched_clock_read (arch/x86/kernel/kvmclock.c:91) 
[102.948678][T20485] ? sched_clock_noinstr (arch/x86/kernel/tsc.c:266) 
[102.949157][T20485] lock_acquire (kernel/locking/lockdep.c:5759) 
[102.949581][T20485] ? rtnl_lock (net/core/rtnetlink.c:80) 
[102.949977][T20485] ? sched_clock_noinstr (arch/x86/kernel/tsc.c:266) 
[102.950442][T20485] __mutex_lock_common (kernel/locking/mutex.c:608) 
[102.950906][T20485] ? rtnl_lock (net/core/rtnetlink.c:80) 
[102.951292][T20485] ? lock_sock_nested (net/core/sock.c:3619) 
[102.951724][T20485] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:63) 
[102.952182][T20485] ? __local_bh_enable_ip (arch/x86/include/asm/irqflags.h:42 arch/x86/include/asm/irqflags.h:97 kernel/softirq.c:387) 
[102.952627][T20485] mutex_lock_nested (kernel/locking/mutex.c:752 kernel/locking/mutex.c:804) 
[102.953058][T20485] ? rtnl_lock (net/core/rtnetlink.c:80) 
[102.953414][T20485] rtnl_lock (net/core/rtnetlink.c:80) 
[102.953790][T20485] do_ip_setsockopt (net/ipv4/ip_sockglue.c:1082) 
[102.954216][T20485] ? kvm_sched_clock_read (arch/x86/kernel/kvmclock.c:91) 
[102.954690][T20485] ? sched_clock_noinstr (arch/x86/kernel/tsc.c:266) 
[102.955140][T20485] ip_setsockopt (net/ipv4/ip_sockglue.c:1419) 
[102.955521][T20485] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:?) 
[102.955912][T20485] ? ipv6_set_mcast_msfilter (net/ipv6/ipv6_sockglue.c:984) 
[102.956398][T20485] tcp_setsockopt (net/ipv4/tcp.c:?) 
[102.956828][T20485] ? tcp_enable_tx_delay (net/ipv4/tcp.c:4024) 
[102.957294][T20485] sock_common_setsockopt (net/core/sock.c:3803) 
[102.957771][T20485] ? sock_common_recvmsg (net/core/sock.c:3799) 
[102.958233][T20485] ? sock_common_recvmsg (net/core/sock.c:3799) 
[102.958680][T20485] __sys_setsockopt (net/socket.c:? net/socket.c:2353) 
[102.959124][T20485] __ia32_sys_setsockopt (net/socket.c:2362 net/socket.c:2359 net/socket.c:2359) 
[102.959583][T20485] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-053-20240920/./arch/x86/include/generated/asm/syscalls_32.h:?) 
[102.960019][T20485] __do_fast_syscall_32 (arch/x86/entry/common.c:?) 
[102.960473][T20485] ? irqentry_exit_to_user_mode (kernel/entry/common.c:234) 
[102.960984][T20485] do_fast_syscall_32 (arch/x86/entry/common.c:411) 
[102.961511][T20485] do_SYSENTER_32 (arch/x86/entry/common.c:449) 
[102.961909][T20485] entry_SYSENTER_32 (arch/x86/entry/entry_32.S:836) 
[  102.962336][T20485] EIP: 0xb7fbb539
[ 102.962665][T20485] Code: 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 0f 1f 00 58 b8 77 00 00 00 cd 80 90 0f 1f
All code
========
   0:	03 74 b4 01          	add    0x1(%rsp,%rsi,4),%esi
   4:	10 07                	adc    %al,(%rdi)
   6:	03 74 b0 01          	add    0x1(%rax,%rsi,4),%esi
   a:	10 08                	adc    %cl,(%rax)
   c:	03 74 d8 01          	add    0x1(%rax,%rbx,8),%esi
	...
  20:	00 51 52             	add    %dl,0x52(%rcx)
  23:	55                   	push   %rbp
  24:*	89 e5                	mov    %esp,%ebp		<-- trapping instruction
  26:	0f 34                	sysenter
  28:	cd 80                	int    $0x80
  2a:	5d                   	pop    %rbp
  2b:	5a                   	pop    %rdx
  2c:	59                   	pop    %rcx
  2d:	c3                   	ret
  2e:	90                   	nop
  2f:	90                   	nop
  30:	90                   	nop
  31:	90                   	nop
  32:	0f 1f 00             	nopl   (%rax)
  35:	58                   	pop    %rax
  36:	b8 77 00 00 00       	mov    $0x77,%eax
  3b:	cd 80                	int    $0x80
  3d:	90                   	nop
  3e:	0f                   	.byte 0xf
  3f:	1f                   	(bad)

Code starting with the faulting instruction
===========================================
   0:	5d                   	pop    %rbp
   1:	5a                   	pop    %rdx
   2:	59                   	pop    %rcx
   3:	c3                   	ret
   4:	90                   	nop
   5:	90                   	nop
   6:	90                   	nop
   7:	90                   	nop
   8:	0f 1f 00             	nopl   (%rax)
   b:	58                   	pop    %rax
   c:	b8 77 00 00 00       	mov    $0x77,%eax
  11:	cd 80                	int    $0x80
  13:	90                   	nop
  14:	0f                   	.byte 0xf
  15:	1f                   	(bad)


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240922/202409221753.e29d62c8-lkp@intel.com

Eric Dumazet Sept. 22, 2024, 4:11 p.m. UTC | #2
On Wed, Sep 18, 2024 at 1:51 AM AnantaSrikar <srikarananta01@gmail.com> wrote:
>
> Fixed the circular lock dependency reported by syzkaller.
>
> Signed-off-by: AnantaSrikar <srikarananta01@gmail.com>
> Reported-by: syzbot+e4c27043b9315839452d@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=e4c27043b9315839452d
> Fixes: d2bafcf224f3 ("Merge tag 'cgroup-for-6.11-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup")
> ---
>  net/ipv4/ip_sockglue.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
> index cf377377b52d..a8f46d1ba62b 100644
> --- a/net/ipv4/ip_sockglue.c
> +++ b/net/ipv4/ip_sockglue.c
> @@ -1073,9 +1073,11 @@ int do_ip_setsockopt(struct sock *sk, int level, int optname,
>         }
>
>         err = 0;
> +
> +       sockopt_lock_sock(sk);
> +
>         if (needs_rtnl)
>                 rtnl_lock();
> -       sockopt_lock_sock(sk);
>
>         switch (optname) {
>         case IP_OPTIONS:

I think you missed an earlier conversation about SMC being at fault here.

https://lore.kernel.org/netdev/CANn89iKcWmufo83xy-SwSrXYt6UpL2Pb+5pWuzyYjMva5F8bBQ@mail.gmail.com/

Ananta Srikar Puranam Sept. 22, 2024, 9:04 p.m. UTC | #3
On 22/09/24 12:11 pm, Eric Dumazet wrote:
> I think you missed an earlier conversation about SMC being at fault here.

You're right, I missed the earlier discussion about SMC. I apologize for 
the oversight and thank you for pointing it out. As a first-time 
contributor, I'll be more diligent in researching existing discussions 
before submitting patches in the future.

Best regards,
Srikar

Patch

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index cf377377b52d..a8f46d1ba62b 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -1073,9 +1073,11 @@  int do_ip_setsockopt(struct sock *sk, int level, int optname,
 	}
 
 	err = 0;
+
+	sockopt_lock_sock(sk);
+
 	if (needs_rtnl)
 		rtnl_lock();
-	sockopt_lock_sock(sk);
 
 	switch (optname) {
 	case IP_OPTIONS:
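
For readers following the lockdep report in comment #1, the check it describes can be modeled in a few lines of userspace C. This is only a toy sketch, not how lockdep is implemented: it records "lock B was taken while lock A was held" as a directed edge and refuses any new edge that would close a cycle. The names `toy_acquire`, `toy_reset`, `RTNL_MUTEX` and `SK_LOCK` are hypothetical stand-ins chosen for illustration, and the assumption that the IPv6 path takes rtnl_mutex before the socket lock is read off the "Possible unsafe locking scenario" table in the report rather than from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* Hypothetical stand-ins for the two lock classes named in the report. */
enum { RTNL_MUTEX, SK_LOCK };

/* dep[a][b] is true once class b has been acquired while class a was held. */
bool dep[MAX_CLASSES][MAX_CLASSES];

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++) {
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquired while held". Returns false, and records nothing,
 * if the new edge held -> acquired would close a cycle, i.e. if
 * `held` is already reachable from `acquired`.
 */
bool toy_acquire(int held, int acquired)
{
	bool seen[MAX_CLASSES] = { false };

	assert(held >= 0 && held < MAX_CLASSES);
	assert(acquired >= 0 && acquired < MAX_CLASSES);

	if (reachable(acquired, held, seen))
		return false;
	dep[held][acquired] = true;
	return true;
}

/* Forget every recorded dependency. */
void toy_reset(void)
{
	memset(dep, 0, sizeof(dep));
}
```

In this model, `toy_acquire(RTNL_MUTEX, SK_LOCK)` stands for the order assumed for the IPv6 path and succeeds. A following `toy_acquire(SK_LOCK, RTNL_MUTEX)`, the order the patch above gives do_ip_setsockopt(), is rejected because it would complete the rtnl_mutex -> sk_lock -> rtnl_mutex cycle.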