Message ID | 61cded134e753976259b48a4269f203dcae6c8fa.1739969092.git.pabeni@redhat.com (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | c02a912d7eee88f0e435aa347d44b53126d77a55 |
Delegated to: | Matthieu Baerts |
Series | [mptcp-net] mptcp: always handle address removal under msk socket lock |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 11 lines checked |
matttbe/shellcheck | success | MPTCP selftests files have not been modified |
matttbe/KVM_Validation__normal | success | Success! ✅ |
matttbe/KVM_Validation__debug | success | Success! ✅ |
matttbe/KVM_Validation__btf-normal__only_bpftest_all_ | success | Success! ✅ |
matttbe/KVM_Validation__btf-debug__only_bpftest_all_ | success | Success! ✅ |
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/13413108659

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/f8164af657a2
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=935572

If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker

Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Hi Paolo,

On 19/02/2025 13:45, Paolo Abeni wrote:
> Syzkaller reported a lockdep splat in the PM control path:

(...)

> Indeed the PM can try to send a RM_ADDR over a msk without acquiring
> first the msk socket lock.
>
> The bugged code-path comes from an early optimization: when there
> are no subflows, the PM should (usually) not send RM_ADDR
> notifications.
>
> The above statement is incorrect, as without locks another process
> could concurrent create a new subflow and cause the RM_ADDR generation.
>
> Additionally the supposed optimization is not very effective even
> performance-wise, as most mptcp sockets should have at least one
> subflow: the MPC one.

Thank you for having looked at this!

By chance, do you know how syzkaller managed to interact with an MPTCP
socket not having any attached subflows? An msk being destroyed?

Just to know if there are specific cases we need to pay attention to in
the future :)

> Address the issue removing the buggy code path, the existing "slow-path"
> will handle correctly even the edge case.

Thank you, it looks good to me!

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

Cheers,
Matt
Hi Paolo,

On 19/02/2025 18:45, Matthieu Baerts wrote:
> Hi Paolo,
>
> On 19/02/2025 13:45, Paolo Abeni wrote:
>> Syzkaller reported a lockdep splat in the PM control path:
>
> (...)
>
>> Indeed the PM can try to send a RM_ADDR over a msk without acquiring
>> first the msk socket lock.
>>
>> The bugged code-path comes from an early optimization: when there
>> are no subflows, the PM should (usually) not send RM_ADDR
>> notifications.
>>
>> The above statement is incorrect, as without locks another process
>> could concurrent create a new subflow and cause the RM_ADDR generation.
>>
>> Additionally the supposed optimization is not very effective even
>> performance-wise, as most mptcp sockets should have at least one
>> subflow: the MPC one.
> Thank you for having looked at this!
>
> By chance, do you know how syzkaller managed to interact with an MPTCP
> socket not having any attached subflows? An msk being destroyed?
>
> Just to know if there are specific cases we need to pay attention to in
> the future :)

(This question was obviously not blocking this patch, so I just applied it)

Now in our tree (fixes for -net):

New patches for t/upstream-net and t/upstream:
- c02a912d7eee: mptcp: always handle address removal under msk socket lock
- Results: aba580cd4fc8..eb57543e795e (export-net)
- Results: 3ee8e45b1642..98f1e7131283 (export)

Tests are now in progress:
- export-net: https://github.com/multipath-tcp/mptcp_net-next/commit/0098419fb05cdbc73ffd9855dc2db613f79366ea/checks
- export: https://github.com/multipath-tcp/mptcp_net-next/commit/d29ae41108d2ac0ae39cd5fe27da1898e826833f/checks

Cheers,
Matt
On 2/19/25 6:45 PM, Matthieu Baerts wrote:
> Hi Paolo,
>
> On 19/02/2025 13:45, Paolo Abeni wrote:
>> Syzkaller reported a lockdep splat in the PM control path:
>
> (...)
>
>> Indeed the PM can try to send a RM_ADDR over a msk without acquiring
>> first the msk socket lock.
>>
>> The bugged code-path comes from an early optimization: when there
>> are no subflows, the PM should (usually) not send RM_ADDR
>> notifications.
>>
>> The above statement is incorrect, as without locks another process
>> could concurrent create a new subflow and cause the RM_ADDR generation.
>>
>> Additionally the supposed optimization is not very effective even
>> performance-wise, as most mptcp sockets should have at least one
>> subflow: the MPC one.
> Thank you for having looked at this!
>
> By chance, do you know how syzkaller managed to interact with an MPTCP
> socket not having any attached subflows? An msk being destroyed?

I'm not 110% and I don't have the syzkaller log for the relevant splat
handy, but i.e. a passive socket at clone time does:

        list_add(&subflow->node, &msk->conn_list); // [1]
        sock_hold(ssk);
        mptcp_token_accept(subflow_req, msk); // [2]

under the msk socket spin lock.

After [2], the PM ops can find and operate on the newly accepted msk
socket, but lacking any lock protection and memory barriers could still
observe an empty conn_list, even given the prior [1].

The race is tiny, thus is hard for syzkaller to find a repro.

/P
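The reader side of that race is the unlocked shortcut this patch removes (see the hunk in the diff below). A simplified sketch with explanatory comments, not the literal upstream code:

```c
/* Simplified from the pre-patch shortcut removed by this patch.
 * The check runs without lock_sock(), so the list_add() done at clone
 * time under the msk socket spin lock (step [1] above) may not be
 * visible yet here.
 */
if (list_empty(&msk->conn_list)) {
        /* Racy: may wrongly conclude "no subflows" ... */
        mptcp_pm_remove_anno_addr(msk, addr, false);
        /* ... and end up generating RM_ADDR without owning the msk
         * socket lock, which is exactly the path in the splat below.
         */
        goto next;
}
```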
Hi Paolo,

On 20/02/2025 10:33, Paolo Abeni wrote:
> On 2/19/25 6:45 PM, Matthieu Baerts wrote:
>> Hi Paolo,
>>
>> On 19/02/2025 13:45, Paolo Abeni wrote:
>>> Syzkaller reported a lockdep splat in the PM control path:
>>
>> (...)
>>
>>> Indeed the PM can try to send a RM_ADDR over a msk without acquiring
>>> first the msk socket lock.
>>>
>>> The bugged code-path comes from an early optimization: when there
>>> are no subflows, the PM should (usually) not send RM_ADDR
>>> notifications.
>>>
>>> The above statement is incorrect, as without locks another process
>>> could concurrent create a new subflow and cause the RM_ADDR generation.
>>>
>>> Additionally the supposed optimization is not very effective even
>>> performance-wise, as most mptcp sockets should have at least one
>>> subflow: the MPC one.
>> Thank you for having looked at this!
>>
>> By chance, do you know how syzkaller managed to interact with an MPTCP
>> socket not having any attached subflows? An msk being destroyed?
>
> I'm not 110% and I don't have the syzkaller log for the relevant splat
> handy, but i.e. a passive socket at clone time does:
>
> list_add(&subflow->node, &msk->conn_list); // [1]
> sock_hold(ssk);
> mptcp_token_accept(subflow_req, msk); // [2]
>
> under the msk socket spin lock.
>
> After [2], the PM ops can find and operate on the newly accepted msk
> socket, but lacking any lock protection and memory barriers could still
> observe an empty conn_list, even given the prior [1].
>
> The race is tiny, thus is hard for syzkaller to find a repro.

I see, thank you for your reply and for having checked!

I will try to keep this in mind when looking at future potential issues.
(And not always assuming there is at least one subflow in the list.)

Cheers,
Matt
On 2/20/25 11:40 AM, Matthieu Baerts wrote:
> On 20/02/2025 10:33, Paolo Abeni wrote:
>> I'm not 110% and I don't have the syzkaller log for the relevant splat
>> handy, but i.e. a passive socket at clone time does:
>>
>> list_add(&subflow->node, &msk->conn_list); // [1]
>> sock_hold(ssk);
>> mptcp_token_accept(subflow_req, msk); // [2]
>>
>> under the msk socket spin lock.
>>
>> After [2], the PM ops can find and operate on the newly accepted msk
>> socket, but lacking any lock protection and memory barriers could still
>> observe an empty conn_list, even given the prior [1].
>>
>> The race is tiny, thus is hard for syzkaller to find a repro.
>
> I see, thank you for your reply and for having checked!
>
> I will try to keep this in mind when looking at future potential issues.
> (And not always assuming there is at least one subflow in the list.)

To clarify: I think such assumption is true under the msk socket lock,
but it can't be asserted without the lock.

/P
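A minimal sketch of the locking rule stated above: conn_list is only inspected, and RM_ADDR only generated, with the msk socket lock held. The wrapper name remove_signal_addr_locked() is hypothetical and the parameter types are assumed; the helper calls and their arguments are taken from the patch context below.

```c
/* Sketch only, not the upstream implementation: both the conn_list
 * lookup and the RM_ADDR generation happen under lock_sock(), so the
 * "at least one subflow" assumption can safely be relied upon here.
 */
static void remove_signal_addr_locked(struct mptcp_sock *msk,
                                      const struct mptcp_addr_info *addr)
{
        struct sock *sk = (struct sock *)msk;
        bool remove_subflow;

        lock_sock(sk);
        remove_subflow = mptcp_lookup_subflow_by_saddr(&msk->conn_list, addr);
        mptcp_pm_remove_anno_addr(msk, addr, remove_subflow);
        release_sock(sk);
}
```

With the unlocked list_empty() shortcut gone, this locked sequence is effectively the only remaining path in the patched function.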
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 572d160edca3..c0e47f4f7b1a 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -1514,11 +1514,6 @@ static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net,
 		if (mptcp_pm_is_userspace(msk))
 			goto next;
 
-		if (list_empty(&msk->conn_list)) {
-			mptcp_pm_remove_anno_addr(msk, addr, false);
-			goto next;
-		}
-
 		lock_sock(sk);
 		remove_subflow = mptcp_lookup_subflow_by_saddr(&msk->conn_list, addr);
 		mptcp_pm_remove_anno_addr(msk, addr, remove_subflow &&
Syzkaller reported a lockdep splat in the PM control path:

WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 sock_owned_by_me include/net/sock.h:1711 [inline]
WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 msk_owned_by_me net/mptcp/protocol.h:363 [inline]
WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 mptcp_pm_nl_addr_send_ack+0x57c/0x610 net/mptcp/pm_netlink.c:788
Modules linked in:
CPU: 0 UID: 0 PID: 6693 Comm: syz.0.205 Not tainted 6.14.0-rc2-syzkaller-00303-gad1b832bf1cf #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
RIP: 0010:sock_owned_by_me include/net/sock.h:1711 [inline]
RIP: 0010:msk_owned_by_me net/mptcp/protocol.h:363 [inline]
RIP: 0010:mptcp_pm_nl_addr_send_ack+0x57c/0x610 net/mptcp/pm_netlink.c:788
Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ca 7b d3 f5 eb b9 e8 c3 7b d3 f5 90 0f 0b 90 e9 dd fb ff ff e8 b5 7b d3 f5 90 <0f> 0b 90 e9 3e fb ff ff 44 89 f1 80 e1 07 38 c1 0f 8c eb fb ff ff
RSP: 0000:ffffc900034f6f60 EFLAGS: 00010283
RAX: ffffffff8bee3c2b RBX: 0000000000000001 RCX: 0000000000080000
RDX: ffffc90004d42000 RSI: 000000000000a407 RDI: 000000000000a408
RBP: ffffc900034f7030 R08: ffffffff8bee37f6 R09: 0100000000000000
R10: dffffc0000000000 R11: ffffed100bcc62e4 R12: ffff88805e6316e0
R13: ffff88805e630c00 R14: dffffc0000000000 R15: ffff88805e630c00
FS: 00007f7e9a7e96c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2fd18ff8 CR3: 0000000032c24000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 mptcp_pm_remove_addr+0x103/0x1d0 net/mptcp/pm.c:59
 mptcp_pm_remove_anno_addr+0x1f4/0x2f0 net/mptcp/pm_netlink.c:1486
 mptcp_nl_remove_subflow_and_signal_addr net/mptcp/pm_netlink.c:1518 [inline]
 mptcp_pm_nl_del_addr_doit+0x118d/0x1af0 net/mptcp/pm_netlink.c:1629
 genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
 genl_rcv_msg+0xb1f/0xec0 net/netlink/genetlink.c:1210
 netlink_rcv_skb+0x206/0x480 net/netlink/af_netlink.c:2543
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
 netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
 netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1348
 netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1892
 sock_sendmsg_nosec net/socket.c:718 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:733
 ____sys_sendmsg+0x53a/0x860 net/socket.c:2573
 ___sys_sendmsg net/socket.c:2627 [inline]
 __sys_sendmsg+0x269/0x350 net/socket.c:2659
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7e9998cde9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7e9a7e9038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f7e99ba5fa0 RCX: 00007f7e9998cde9
RDX: 000000002000c094 RSI: 0000400000000000 RDI: 0000000000000007
RBP: 00007f7e99a0e2a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f7e99ba5fa0 R15: 00007fff49231088

Indeed the PM can try to send a RM_ADDR over a msk without acquiring
first the msk socket lock.
The bugged code-path comes from an early optimization: when there
are no subflows, the PM should (usually) not send RM_ADDR
notifications.

The above statement is incorrect, as without locks another process
could concurrently create a new subflow and cause the RM_ADDR generation.

Additionally, the supposed optimization is not very effective even
performance-wise, as most mptcp sockets should have at least one
subflow: the MPC one.

Address the issue by removing the buggy code path; the existing "slow-path"
will correctly handle even the edge case.

Reported-by: syzbot+cd3ce3d03a3393ae9700@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/546
Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/pm_netlink.c | 5 -----
 1 file changed, 5 deletions(-)
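As background for the splat quoted above, the WARNING comes from the socket-ownership assertion in the networking core. The following is an approximate paraphrase based on include/net/sock.h and net/mptcp/protocol.h (exact definitions may differ between kernel versions), not code from this patch:

```c
/* Approximate paraphrase of the assertion seen in the splat above; the
 * real definitions live in include/net/sock.h and net/mptcp/protocol.h
 * and may differ slightly across kernel versions.
 */
static inline void sock_owned_by_me(const struct sock *sk)
{
#ifdef CONFIG_LOCKDEP
	WARN_ON_ONCE(!lockdep_sock_is_held(sk) && debug_locks);
#endif
}

#define msk_owned_by_me(msk) \
	sock_owned_by_me((const struct sock *)msk)
```

mptcp_pm_nl_addr_send_ack() performs this check, so reaching it via mptcp_pm_remove_addr() without lock_sock() held trips the warning on lockdep-enabled builds, which is the path shown in the call trace above.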