Message ID: 20250211-b4-ovpn-v19-0-86d5daf2a47a@openvpn.net (mailing list archive)
Series: Introducing OpenVPN Data Channel Offload

Hello,

2025-02-11, 01:39:53 +0100, Antonio Quartulli wrote:
> All minor and major reported problems have been finally addressed.
> Big thanks to Sabrina, who took the time to guide me through
> converting the peer socket to an RCU pointer.

Something is off (not sure if it's new to this version): if I use
test-tcp.sh to setup a set of interfaces and peers (I stop the test
just after setup to keep the environment alive), then remove all netns
with "ip -all netns delete", I expect all devices to go away, but they
don't. With debug messages enabled I'm seeing some activity from the
module ("tun0: sending keepalive to peer 3" and so on), and
ovpn_net_uninit/ovpn_priv_free never got called.

[...]

> So there is NO risk of deadlock (and indeed nothing hangs), but I
> couldn't find a way to make the warning go away.

I've spotted another splat on strparser cleanup that looked like an
actual deadlock, but it's not very reproducible. Still looking into
it, but I'm not convinced it's ok to call strp_done (as is done from
ovpn_tcp_socket_detach) while under lock_sock, because AFAIU
cancel_work_sync(&strp->work) may be waiting for a work that needs to
lock the socket (cb.lock in do_strp_work). I guess tcp_tx_work would
have the same problem.

On 13/02/2025 00:34, Sabrina Dubroca wrote:
> Hello,
>
> 2025-02-11, 01:39:53 +0100, Antonio Quartulli wrote:
>> All minor and major reported problems have been finally addressed.
>> Big thanks to Sabrina, who took the time to guide me through
>> converting the peer socket to an RCU pointer.
>
> Something is off (not sure if it's new to this version): if I use
> test-tcp.sh to setup a set of interfaces and peers (I stop the test
> just after setup to keep the environment alive), then remove all netns
> with "ip -all netns delete", I expect all devices to go away, but they
> don't. With debug messages enabled I'm seeing some activity from the
> module ("tun0: sending keepalive to peer 3" and so on), and
> ovpn_net_uninit/ovpn_priv_free never got called.

I can reproduce it. If later I rmmod ovpn I then get all the "Deleting peer"
messages.
So instances are not being purged on netns exit.

Will dive into it.

> [...]
>> So there is NO risk of deadlock (and indeed nothing hangs), but I
>> couldn't find a way to make the warning go away.
>
> I've spotted another splat on strparser cleanup that looked like an
> actual deadlock, but it's not very reproducible. Still looking into
> it, but I'm not convinced it's ok to call strp_done (as is done from
> ovpn_tcp_socket_detach) while under lock_sock, because AFAIU
> cancel_work_sync(&strp->work) may be waiting for a work that needs to
> lock the socket (cb.lock in do_strp_work). I guess tcp_tx_work would
> have the same problem.

Will have a look here too.

Thanks!

Hi Antonio,

On 2/10/25 19:39, Antonio Quartulli wrote:
> NOTE: TCP tests are still showing the following warning while running
> iperf.
> I have analysed the report several times, but it definitely looks like
> a false positive to me, so nothing to worry about.
>
> Basically the lockdep engine gets confused thinking that we are
> acquiring the lock twice on the same sock, but actually:
> 1) the kernel is first locking the 'iperf' (user) TCP socket;
> 2) ovpn is later locking the underlying TCP transport socket.
>
> So there is NO risk of deadlock (and indeed nothing hangs), but I
> couldn't find a way to make the warning go away.

I think you can use a "nested" lock to avoid this. See e.g. commit
86a41ea9fd79 ("l2tp: fix lockdep splat") for an example.

--Sean

[1] https://www.kernel.org/doc/html/latest/locking/lockdep-design.html#exception-nested-data-dependencies-leading-to-nested-locking

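For reference, a minimal sketch of the nested-lock annotation being suggested here, in the spirit of the l2tp commit above. The function below and its call site are hypothetical, not the actual ovpn code; only lock_sock_nested(), release_sock() and SINGLE_DEPTH_NESTING are real kernel symbols:

#include <net/sock.h>

/* Hypothetical sketch: lock the underlying transport socket with a
 * distinct lockdep subclass, so lockdep does not conflate it with the
 * user-visible TCP socket (e.g. iperf's) that is already locked
 * further up the call chain.
 */
static void ovpn_tcp_lock_transport_sketch(struct sock *transport_sk)
{
	lock_sock_nested(transport_sk, SINGLE_DEPTH_NESTING);

	/* ... move ovpn payload onto the transport socket ... */

	release_sock(transport_sk);
}

The l2tp fix defines its own nesting constant for this purpose; whether SINGLE_DEPTH_NESTING or a dedicated subclass is the right choice for ovpn is a detail for the actual patch.
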
2025-02-13, 12:46:34 +0100, Antonio Quartulli wrote:
> On 13/02/2025 00:34, Sabrina Dubroca wrote:
> > Hello,
> >
> > 2025-02-11, 01:39:53 +0100, Antonio Quartulli wrote:
> > > All minor and major reported problems have been finally addressed.
> > > Big thanks to Sabrina, who took the time to guide me through
> > > converting the peer socket to an RCU pointer.
> >
> > Something is off (not sure if it's new to this version): if I use
> > test-tcp.sh to setup a set of interfaces and peers (I stop the test
> > just after setup to keep the environment alive), then remove all netns
> > with "ip -all netns delete", I expect all devices to go away, but they
> > don't. With debug messages enabled I'm seeing some activity from the
> > module ("tun0: sending keepalive to peer 3" and so on), and
> > ovpn_net_uninit/ovpn_priv_free never got called.
>
> I can reproduce it. If later I rmmod ovpn I then get all the "Deleting peer"
> messages.
> So instances are not being purged on netns exit.
>
> Will dive into it.

I think the socket holds a ref on the netns, so it's not getting
destroyed, simply "removed" from iproute's point of view. And the
socket isn't going away as long as it's used by a peer.

If I delete the peer(s) for the ovpn device and then the netns it was
in, the netns is fully removed, and the ovpn device is gone. Also no
issue if I delete the ovpn device before its netns, then everything is
destroyed as expected.

I'm not sure that can be solved, at least under the current refcount
scheme.

But I don't think there's a way to re-attach to that netns afterwards
if we wanted to clean up manually (something similar to "ip netns
attach <name> <pid>", but that won't work if whatever created the
socket is not running anymore -- as is the case with ovpn-cli).

> >
> > [...]
> > > So there is NO risk of deadlock (and indeed nothing hangs), but I
> > > couldn't find a way to make the warning go away.
> >
> > I've spotted another splat on strparser cleanup that looked like an
> > actual deadlock, but it's not very reproducible. Still looking into
> > it, but I'm not convinced it's ok to call strp_done (as is done from
> > ovpn_tcp_socket_detach) while under lock_sock, because AFAIU
> > cancel_work_sync(&strp->work) may be waiting for a work that needs to
> > lock the socket (cb.lock in do_strp_work). I guess tcp_tx_work would
> > have the same problem.
>
> Will have a look here too.

The only ways I've managed to reproduce it are by using some ugly
kernel-side hacks to try to force that path being hit. Either forcing
the strp work to be queued just as we detach the socket (in
strp_stop), or this:

- hack some code path to do a big sleep() under lock_sock(), to give
  me the time to do the next steps
- ping over ovpn, or any other way to add packets on the receive
  socket
- delete the peer for the sleeping socket (while it's still sleeping)

When that big sleep is over, strp_data_ready will kick off and queue
its worker (because at this point the socket lock is still owned),
then I think del_peer proceeds toward tcp_detach and I got the splat
below. So that's maybe a bit hard to trigger in real life (I don't
remember what I was doing the first time I got it, I think something
messing with the RCU stuff we discussed earlier).

[ 922.681435][ T300] ======================================================
[ 922.686247][ T300] WARNING: possible circular locking dependency detected
[ 922.690971][ T300] 6.14.0-rc1-net-00276-g95f1f7ea224e-dirty #279 Tainted: G N
[ 922.696584][ T300] ------------------------------------------------------
[ 922.699697][ T300] kworker/1:2/300 is trying to acquire lock:
[ 922.702105][ T300] ffff88800a662160 ((work_completion)(&strp->work)){+.+.}-{0:0}, at: start_flush_work+0x407/0xa50
[ 922.705716][ T300]
[ 922.705716][ T300] but task is already holding lock:
[ 922.707779][ T300] ffff8880113edd98 (sk_lock-AF_INET){+.+.}-{0:0}, at: ovpn_socket_release+0x8a/0x1a0
[ 922.710238][ T300]
[ 922.710238][ T300] which lock already depends on the new lock.
[ 922.710238][ T300]
[ 922.712628][ T300]
[ 922.712628][ T300] the existing dependency chain (in reverse order) is:
[ 922.714443][ T300]
[ 922.714443][ T300] -> #1 (sk_lock-AF_INET){+.+.}-{0:0}:
[ 922.716127][ T300] __lock_acquire+0xc4d/0x1ee0
[ 922.717250][ T300] lock_acquire+0x1a9/0x500
[ 922.718266][ T300] lock_sock_nested+0x40/0xf0
[ 922.719325][ T300] strp_work+0x95/0x1e0
[ 922.720240][ T300] process_one_work+0xe28/0x1460
[ 922.721307][ T300] worker_thread+0x674/0xee0
[ 922.722283][ T300] kthread+0x3c3/0x760
[ 922.723101][ T300] ret_from_fork+0x46/0x80
[ 922.723792][ T300] ret_from_fork_asm+0x1a/0x30
[ 922.724532][ T300]
[ 922.724532][ T300] -> #0 ((work_completion)(&strp->work)){+.+.}-{0:0}:
[ 922.726038][ T300] check_prev_add+0x1af/0x2400
[ 922.726927][ T300] validate_chain+0xdcf/0x1a10
[ 922.727847][ T300] __lock_acquire+0xc4d/0x1ee0
[ 922.728721][ T300] lock_acquire+0x1a9/0x500
[ 922.729590][ T300] start_flush_work+0x41a/0xa50
[ 922.730434][ T300] __flush_work+0xee/0x210
[ 922.731213][ T300] cancel_work_sync+0xb8/0xd0
[ 922.732022][ T300] strp_done.cold+0x51/0xcf
[ 922.732830][ T300] ovpn_tcp_socket_detach+0x28e/0x2de
[ 922.733752][ T300] ovpn_socket_release_kref+0x1ef/0x350
[ 922.734713][ T300] ovpn_socket_release+0xe7/0x1a0
[ 922.735577][ T300] ovpn_peer_remove_work+0x2b/0x90
[ 922.736468][ T300] process_one_work+0xe28/0x1460
[ 922.737357][ T300] worker_thread+0x674/0xee0
[ 922.737985][ T300] kthread+0x3c3/0x760
[ 922.738545][ T300] ret_from_fork+0x46/0x80
[ 922.739177][ T300] ret_from_fork_asm+0x1a/0x30
[ 922.739846][ T300]
[ 922.739846][ T300] other info that might help us debug this:
[ 922.739846][ T300]
[ 922.741159][ T300] Possible unsafe locking scenario:
[ 922.741159][ T300]
[ 922.742165][ T300]        CPU0                    CPU1
[ 922.743055][ T300]        ----                    ----
[ 922.743943][ T300]   lock(sk_lock-AF_INET);
[ 922.744638][ T300]                                lock((work_completion)(&strp->work));
[ 922.745786][ T300]                                lock(sk_lock-AF_INET);
[ 922.746837][ T300]   lock((work_completion)(&strp->work));
[ 922.747788][ T300]
[ 922.747788][ T300]  *** DEADLOCK ***
[ 922.747788][ T300]
[ 922.748856][ T300] 4 locks held by kworker/1:2/300:
[ 922.749474][ T300]  #0: ffff888008499b48 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0xd80/0x1460
[ 922.750588][ T300]  #1: ffffc90002527d30 ((work_completion)(&peer->remove_work)){+.+.}-{0:0}, at: process_one_work+0xddf/0x1460
[ 922.751894][ T300]  #2: ffff8880113edd98 (sk_lock-AF_INET){+.+.}-{0:0}, at: ovpn_socket_release+0x8a/0x1a0
[ 922.752983][ T300]  #3: ffffffff854de980 (rcu_read_lock){....}-{1:3}, at: start_flush_work+0x39/0xa50
[ 922.754018][ T300]
[ 922.754018][ T300] stack backtrace:
[ 922.754734][ T300] CPU: 1 UID: 0 PID: 300 Comm: kworker/1:2 Tainted: G N 6.14.0-rc1-net-00276-g95f1f7ea224e-dirty #279
[ 922.754748][ T300] Tainted: [N]=TEST
[ 922.754752][ T300] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 922.754761][ T300] Workqueue: events ovpn_peer_remove_work
[ 922.754779][ T300] Call Trace:
[ 922.754785][ T300]  <TASK>
[ 922.754791][ T300]  dump_stack_lvl+0xa5/0x100
[ 922.754803][ T300]  print_circular_bug.cold+0x38/0x48
[ 922.754820][ T300]  check_noncircular+0x2f2/0x3d0
[ 922.754922][ T300]  check_prev_add+0x1af/0x2400
[ 922.754942][ T300]  validate_chain+0xdcf/0x1a10
[ 922.754991][ T300]  __lock_acquire+0xc4d/0x1ee0
[ 922.755023][ T300]  lock_acquire+0x1a9/0x500
[ 922.755104][ T300]  start_flush_work+0x41a/0xa50
[ 922.755128][ T300]  __flush_work+0xee/0x210
[ 922.755198][ T300]  cancel_work_sync+0xb8/0xd0
[ 922.755211][ T300]  strp_done.cold+0x51/0xcf
[ 922.755222][ T300]  ovpn_tcp_socket_detach+0x28e/0x2de
[ 922.755237][ T300]  ovpn_socket_release_kref+0x1ef/0x350
[ 922.755253][ T300]  ovpn_socket_release+0xe7/0x1a0
[ 922.755268][ T300]  ovpn_peer_remove_work+0x2b/0x90
[ 922.755282][ T300]  process_one_work+0xe28/0x1460
[ 922.755330][ T300]  worker_thread+0x674/0xee0
[ 922.755402][ T300]  kthread+0x3c3/0x760
[ 922.755472][ T300]  ret_from_fork+0x46/0x80
[ 922.755497][ T300]  ret_from_fork_asm+0x1a/0x30
[ 922.755521][ T300]  </TASK>

And then come the hung task warnings:
- del_peer waiting on the completion
- strp waiting on lock_sock
- peer cleanup/socket detach waiting on cancel_work_sync/strp

[ 1106.347400][ T39] INFO: task kworker/u16:0:11 blocked for more than 122 seconds.
[ 1106.348547][ T39] Tainted: G N 6.14.0-rc1-net-00276-g95f1f7ea224e-dirty #279
[ 1106.349671][ T39] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1106.350748][ T39] task:kworker/u16:0 state:D stack:0 pid:11 tgid:11 ppid:2 task_flags:0x4208160 flags:0x00004000
[ 1106.352454][ T39] Workqueue: kstrp strp_work
[ 1106.353133][ T39] Call Trace:
[ 1106.353659][ T39]  <TASK>
[ 1106.354143][ T39]  __schedule+0xace/0x2620
[ 1106.360659][ T39]  schedule+0xd0/0x210
[ 1106.361266][ T39]  __lock_sock+0x137/0x230
[ 1106.365430][ T39]  lock_sock_nested+0xcb/0xf0
[ 1106.366173][ T39]  strp_work+0x95/0x1e0
[ 1106.366845][ T39]  process_one_work+0xe28/0x1460
[ 1106.369231][ T39]  worker_thread+0x674/0xee0
[ 1106.371594][ T39]  kthread+0x3c3/0x760
[ 1106.375692][ T39]  ret_from_fork+0x46/0x80
[ 1106.377040][ T39]  ret_from_fork_asm+0x1a/0x30
[ 1106.377851][ T39]  </TASK>
[ 1106.378355][ T39] INFO: task kworker/1:2:300 blocked for more than 122 seconds.
[ 1106.379590][ T39] Tainted: G N 6.14.0-rc1-net-00276-g95f1f7ea224e-dirty #279
[ 1106.381079][ T39] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1106.382372][ T39] task:kworker/1:2 state:D stack:0 pid:300 tgid:300 ppid:2 task_flags:0x4208060 flags:0x00004000
[ 1106.384211][ T39] Workqueue: events ovpn_peer_remove_work
[ 1106.385072][ T39] Call Trace:
[ 1106.385648][ T39]  <TASK>
[ 1106.386139][ T39]  __schedule+0xace/0x2620
[ 1106.393518][ T39]  schedule+0xd0/0x210
[ 1106.394195][ T39]  schedule_timeout+0x18c/0x240
[ 1106.398840][ T39]  __wait_for_common+0x3e3/0x610
[ 1106.403079][ T39]  __flush_work+0x14f/0x210
[ 1106.406874][ T39]  cancel_work_sync+0xb8/0xd0
[ 1106.407519][ T39]  strp_done.cold+0x51/0xcf
[ 1106.408109][ T39]  ovpn_tcp_socket_detach+0x28e/0x2de
[ 1106.408851][ T39]  ovpn_socket_release_kref+0x1ef/0x350
[ 1106.409588][ T39]  ovpn_socket_release+0xe7/0x1a0
[ 1106.410271][ T39]  ovpn_peer_remove_work+0x2b/0x90
[ 1106.410973][ T39]  process_one_work+0xe28/0x1460
[ 1106.413075][ T39]  worker_thread+0x674/0xee0
[ 1106.416968][ T39]  kthread+0x3c3/0x760
[ 1106.419983][ T39]  ret_from_fork+0x46/0x80
[ 1106.421172][ T39]  ret_from_fork_asm+0x1a/0x30
[ 1106.421829][ T39]  </TASK>
[ 1106.422260][ T39] INFO: task ovpn-cli:1213 blocked for more than 122 seconds.
[ 1106.423215][ T39] Tainted: G N 6.14.0-rc1-net-00276-g95f1f7ea224e-dirty #279
[ 1106.424451][ T39] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1106.425643][ T39] task:ovpn-cli state:D stack:0 pid:1213 tgid:1213 ppid:514 task_flags:0x400100 flags:0x00000002
[ 1106.427145][ T39] Call Trace:
[ 1106.427610][ T39]  <TASK>
[ 1106.428005][ T39]  __schedule+0xace/0x2620
[ 1106.431820][ T39]  schedule+0xd0/0x210
[ 1106.432351][ T39]  schedule_timeout+0x18c/0x240
[ 1106.435050][ T39]  __wait_for_common+0x3e3/0x610
[ 1106.439809][ T39]  ovpn_nl_peer_del_doit+0x270/0x6e0
[ 1106.442825][ T39]  genl_family_rcv_msg_doit+0x1ea/0x2e0
[ 1106.445602][ T39]  genl_family_rcv_msg+0x3a7/0x5b0
[ 1106.450559][ T39]  genl_rcv_msg+0xb1/0x160
[ 1106.451154][ T39]  netlink_rcv_skb+0x13e/0x3d0
[ 1106.455220][ T39]  genl_rcv+0x29/0x40
[ 1106.455737][ T39]  netlink_unicast+0x491/0x730
[ 1106.457107][ T39]  netlink_sendmsg+0x77d/0xc00
[ 1106.458517][ T39]  ____sys_sendmsg+0x7c5/0xac0
[ 1106.461329][ T39]  ___sys_sendmsg+0x163/0x1b0
[ 1106.468146][ T39]  __sys_sendmsg+0x135/0x1d0
[ 1106.471564][ T39]  do_syscall_64+0x64/0x140
[ 1106.472173][ T39]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 1106.472947][ T39] RIP: 0033:0x7faa76628e56
[ 1106.473558][ T39] RSP: 002b:00007ffe757311e0 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
[ 1106.474674][ T39] RAX: ffffffffffffffda RBX: 00007faa7654f740 RCX: 00007faa76628e56
[ 1106.475715][ T39] RDX: 0000000000000000 RSI: 00007ffe75731270 RDI: 0000000000000003
[ 1106.476847][ T39] RBP: 00007ffe757311f0 R08: 0000000000000000 R09: 0000000000000000
[ 1106.477959][ T39] R10: 0000000000000000 R11: 0000000000000202 R12: 00005617490cf490
[ 1106.479064][ T39] R13: 00005617490d04f0 R14: 00007ffe75731270 R15: 0000561733d62d78
[ 1106.480175][ T39]  </TASK>
[ 1106.480646][ T39] INFO: lockdep is turned off.

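In short, the cycle in the report above boils down to the following shape. This is a reduced sketch: the two functions below only illustrate the code paths named in the traces (strp_work and the ovpn detach path), they are not the real implementations:

#include <net/sock.h>
#include <net/strparser.h>

/* Leg 1 (kstrp worker): the strparser work needs the socket lock
 * before it can run its parser callbacks.
 */
static void strp_worker_leg_sketch(struct sock *sk)
{
	lock_sock(sk);		/* blocks while the detach path holds the lock */
	/* ... process queued records ... */
	release_sock(sk);
}

/* Leg 2 (peer removal / socket detach): the socket lock is held while
 * strp_done() is called, and strp_done() does cancel_work_sync() on the
 * very work item that is stuck in leg 1 -> circular wait.
 */
static void detach_leg_sketch(struct sock *sk, struct strparser *strp)
{
	lock_sock(sk);
	strp_done(strp);	/* waits for the strparser work to finish */
	release_sock(sk);
}
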
On 13/02/2025 16:46, Sabrina Dubroca wrote:
> 2025-02-13, 12:46:34 +0100, Antonio Quartulli wrote:
>> On 13/02/2025 00:34, Sabrina Dubroca wrote:
>>> Hello,
>>>
>>> 2025-02-11, 01:39:53 +0100, Antonio Quartulli wrote:
>>>> All minor and major reported problems have been finally addressed.
>>>> Big thanks to Sabrina, who took the time to guide me through
>>>> converting the peer socket to an RCU pointer.
>>>
>>> Something is off (not sure if it's new to this version): if I use
>>> test-tcp.sh to setup a set of interfaces and peers (I stop the test
>>> just after setup to keep the environment alive), then remove all netns
>>> with "ip -all netns delete", I expect all devices to go away, but they
>>> don't. With debug messages enabled I'm seeing some activity from the
>>> module ("tun0: sending keepalive to peer 3" and so on), and
>>> ovpn_net_uninit/ovpn_priv_free never got called.
>>
>> I can reproduce it. If later I rmmod ovpn I then get all the "Deleting peer"
>> messages.
>> So instances are not being purged on netns exit.
>>
>> Will dive into it.
>
> I think the socket holds a ref on the netns, so it's not getting
> destroyed, simply "removed" from iproute's point of view. And the
> socket isn't going away as long as it's used by a peer.

After a deep dive I was getting to the same conclusion:
cleanup_net() is never invoked because we lack an invocation of put_net().
sk_alloc() invokes get_net(), so this is the ref that is not being
released.

>
> If I delete the peer(s) for the ovpn device and then the netns it was
> in, the netns is fully removed, and the ovpn device is gone. Also no
> issue if I delete the ovpn device before its netns, then everything is
> destroyed as expected.
>
> I'm not sure that can be solved, at least under the current refcount
> scheme.

I went back to v12 of the patchset (which is pre-refcount-restructuring)
and indeed the problem there doesn't exist.

However, it's unclear to me how in v12 the socket release was happening
upon netns delete. Who was triggering that? ovpn still needed to call
sockfd_put() in order to let it go.

Will investigate some more and think about a solution.

>
> But I don't think there's a way to re-attach to that netns afterwards
> if we wanted to clean up manually (something similar to "ip netns
> attach <name> <pid>", but that won't work if whatever created the
> socket is not running anymore -- as is the case with ovpn-cli).

Yeah, it just hangs in background until reboot.

>>>
>>> [...]
>>>> So there is NO risk of deadlock (and indeed nothing hangs), but I
>>>> couldn't find a way to make the warning go away.
>>>
>>> I've spotted another splat on strparser cleanup that looked like an
>>> actual deadlock, but it's not very reproducible. Still looking into
>>> it, but I'm not convinced it's ok to call strp_done (as is done from
>>> ovpn_tcp_socket_detach) while under lock_sock, because AFAIU
>>> cancel_work_sync(&strp->work) may be waiting for a work that needs to
>>> lock the socket (cb.lock in do_strp_work). I guess tcp_tx_work would
>>> have the same problem.
>>
>> Will have a look here too.

Thanks for all the info below.
I will have a deeper look at this tomorrow.

Regards,

> [...]

Hi Sean,

On 13/02/2025 15:26, Sean Anderson wrote:
> Hi Antonio,
>
> On 2/10/25 19:39, Antonio Quartulli wrote:
>> NOTE: TCP tests are still showing the following warning while running
>> iperf.
>> I have analysed the report several times, but it definitely looks like
>> a false positive to me, so nothing to worry about.
>>
>> Basically the lockdep engine gets confused thinking that we are
>> acquiring the lock twice on the same sock, but actually:
>> 1) the kernel is first locking the 'iperf' (user) TCP socket;
>> 2) ovpn is later locking the underlying TCP transport socket.
>>
>> So there is NO risk of deadlock (and indeed nothing hangs), but I
>> couldn't find a way to make the warning go away.
>
> I think you can use a "nested" lock to avoid this. See e.g. commit
> 86a41ea9fd79 ("l2tp: fix lockdep splat") for an example.

Thanks a lot for this pointer.
By reading the commit message it seems I am hitting the very same
"issue". I will try to port the same solution to ovpn then :)

Thanks again!

Regards,

> --Sean
>
> [1] https://www.kernel.org/doc/html/latest/locking/lockdep-design.html#exception-nested-data-dependencies-leading-to-nested-locking

On 13/02/2025 20:40, Antonio Quartulli wrote:
> On 13/02/2025 16:46, Sabrina Dubroca wrote:
>> 2025-02-13, 12:46:34 +0100, Antonio Quartulli wrote:
>>> On 13/02/2025 00:34, Sabrina Dubroca wrote:
>>>> Hello,
>>>>
>>>> 2025-02-11, 01:39:53 +0100, Antonio Quartulli wrote:
>>>>> All minor and major reported problems have been finally addressed.
>>>>> Big thanks to Sabrina, who took the time to guide me through
>>>>> converting the peer socket to an RCU pointer.
>>>>
>>>> Something is off (not sure if it's new to this version): if I use
>>>> test-tcp.sh to setup a set of interfaces and peers (I stop the test
>>>> just after setup to keep the environment alive), then remove all netns
>>>> with "ip -all netns delete", I expect all devices to go away, but they
>>>> don't. With debug messages enabled I'm seeing some activity from the
>>>> module ("tun0: sending keepalive to peer 3" and so on), and
>>>> ovpn_net_uninit/ovpn_priv_free never got called.
>>>
>>> I can reproduce it. If later I rmmod ovpn I then get all the
>>> "Deleting peer" messages.
>>> So instances are not being purged on netns exit.
>>>
>>> Will dive into it.
>>
>> I think the socket holds a ref on the netns, so it's not getting
>> destroyed, simply "removed" from iproute's point of view. And the
>> socket isn't going away as long as it's used by a peer.
>
> After a deep dive I was getting to the same conclusion:
> cleanup_net() is never invoked because we lack an invocation of put_net().
> sk_alloc() invokes get_net(), so this is the ref that is not being
> released.
>
>> If I delete the peer(s) for the ovpn device and then the netns it was
>> in, the netns is fully removed, and the ovpn device is gone. Also no
>> issue if I delete the ovpn device before its netns, then everything is
>> destroyed as expected.
>>
>> I'm not sure that can be solved, at least under the current refcount
>> scheme.
>
> I went back to v12 of the patchset (which is pre-refcount-restructuring)
> and indeed the problem there doesn't exist.
>
> However, it's unclear to me how in v12 the socket release was happening
> upon netns delete. Who was triggering that? ovpn still needed to call
> sockfd_put() in order to let it go.
>
> Will investigate some more and think about a solution.

I may have done something wrong, but today I tried again and I am
reproducing this issue also on v8 + 6.11.

I am indeed wondering how it could have ever worked: if we don't delete
the peer, we don't detach the socket, hence the following chain:

  sockfd_put() -> sk_destructor() -> put_net()

does not happen.

Shouldn't we be notified somehow that the netns is going away?

For example in wireguard/device.c the socket is released in
pernet_operations.pre_exit().

But pre_exit() is invoked in cleanup_net(), which is invoked ONLY if the
net refcount has reached 0... but how can it be zero before the sockets
have been released?

I must be missing something, because this seems to be a reference loop.

I'll keep digging..

Regards,

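For context, the wireguard-style pattern referred to above looks roughly like the sketch below. The ovpn-side names are hypothetical (ovpn_release_sockets_in_net() does not exist, it just stands for "drop whatever pins sockets in this netns"); pernet_operations, .pre_exit and register_pernet_device() are the real kernel API:

#include <linux/module.h>
#include <net/net_namespace.h>

static void ovpn_pernet_pre_exit_sketch(struct net *net)
{
	/* Would close/release the transport sockets living in this netns,
	 * so their put_net() happens and cleanup can proceed.
	 */
	/* ovpn_release_sockets_in_net(net); */
}

static struct pernet_operations ovpn_pernet_ops_sketch = {
	.pre_exit = ovpn_pernet_pre_exit_sketch,
};

static int __init ovpn_pernet_sketch_init(void)
{
	return register_pernet_device(&ovpn_pernet_ops_sketch);
}
module_init(ovpn_pernet_sketch_init);

static void __exit ovpn_pernet_sketch_exit(void)
{
	unregister_pernet_device(&ovpn_pernet_ops_sketch);
}
module_exit(ovpn_pernet_sketch_exit);

MODULE_LICENSE("GPL");

As the message above already notes (and the follow-up below confirms), this hook alone does not help here: cleanup_net() only runs once the netns refcount drops to zero, and a userspace-created socket still holds one of those references.
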
On 14/02/2025 14:54, Antonio Quartulli wrote:
> For example in wireguard/device.c the socket is released in
> pernet_operations.pre_exit().
>
> But pre_exit() is invoked in cleanup_net(), which is invoked ONLY if the
> net refcount has reached 0... but how can it be zero before the sockets
> have been released?
>
> I must be missing something, because this seems to be a reference loop.

FTR, the answer is that sockets created in-kernel (like for wireguard)
have sk->sk_net_refcnt set to 0, which implies that no reference to the
netns is taken.

So ovpn has this issue because our sockets are coming from userspace.

Regards,
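To make the distinction concrete, this is roughly the logic inside sk_alloc() that the message refers to (simplified sketch, not the exact upstream code; recent kernels take the reference through a tracked helper, but the effect is the same):

#include <net/net_namespace.h>
#include <net/sock.h>

/* Simplified sketch of the netns-reference part of sk_alloc(). */
static void sk_netns_ref_sketch(struct sock *sk, struct net *net, int kern)
{
	/* kernel sockets (kern == 1), e.g. wireguard's, do not pin the netns */
	sk->sk_net_refcnt = kern ? 0 : 1;
	if (sk->sk_net_refcnt)
		get_net(net);	/* userspace-created sockets take the netns
				 * reference that is only dropped by put_net()
				 * in the socket destructor */
}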