Message ID | tencent_274B82754376EF66A23C0D37029644374609@qq.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Matthieu Baerts |
Headers | show |
Series | mptcp: pm: Fix uaf in __timer_delete_sync | expand |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | warning | total: 0 errors, 2 warnings, 0 checks, 13 lines checked |
matttbe/shellcheck | success | MPTCP selftests files have not been modified |
matttbe/KVM_Validation__normal | success | Success! ✅ |
matttbe/KVM_Validation__debug | warning | Unstable: 1 failed test(s): mptcp_connect_mmap - Critical: 1 Call Trace(s) ❌ |
matttbe/KVM_Validation__btf__only_bpftest_all_ | success | Success! ✅ |
On Tue, Sep 3, 2024 at 5:10 PM Edward Adam Davis <eadavis@qq.com> wrote: > > There are two paths to access mptcp_pm_del_add_timer, result in a race > condition: > > CPU1 CPU2 > ==== ==== > net_rx_action > napi_poll netlink_sendmsg > __napi_poll netlink_unicast > process_backlog netlink_unicast_kernel > __netif_receive_skb genl_rcv > __netif_receive_skb_one_core netlink_rcv_skb > NF_HOOK genl_rcv_msg > ip_local_deliver_finish genl_family_rcv_msg > ip_protocol_deliver_rcu genl_family_rcv_msg_doit > tcp_v4_rcv mptcp_pm_nl_flush_addrs_doit > tcp_v4_do_rcv mptcp_nl_remove_addrs_list > tcp_rcv_established mptcp_pm_remove_addrs_and_subflows > tcp_data_queue remove_anno_list_by_saddr > mptcp_incoming_options mptcp_pm_del_add_timer > mptcp_pm_del_add_timer kfree(entry) > > In remove_anno_list_by_saddr(running on CPU2), after leaving the critical > zone protected by "pm.lock", the entry will be released, which leads to the > occurrence of uaf in the mptcp_pm_del_add_timer(running on CPU1). > > Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d > Signed-off-by: Edward Adam Davis <eadavis@qq.com> > --- > net/mptcp/pm_netlink.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c > index 3e4ad801786f..d28bf0c9ad66 100644 > --- a/net/mptcp/pm_netlink.c > +++ b/net/mptcp/pm_netlink.c > @@ -336,11 +336,12 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk, > entry = mptcp_lookup_anno_list_by_saddr(msk, addr); > if (entry && (!check_id || entry->addr.id == addr->id)) > entry->retrans_times = ADD_ADDR_RETRANS_MAX; > - spin_unlock_bh(&msk->pm.lock); > > if (entry && (!check_id || entry->addr.id == addr->id)) > sk_stop_timer_sync(sk, &entry->add_timer); > > + spin_unlock_bh(&msk->pm.lock); mptcp_pm_add_timer() needs to lock msk->pm.lock Your patch might add a deadlock, because sk_stop_timer_sync() is calling del_timer_sync() What is preventing this ?
Hi Edward, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal: Success! ✅ - KVM Validation: debug: Unstable: 1 failed test(s): mptcp_connect_mmap - Critical: 1 Call Trace(s) ❌ - KVM Validation: btf (only bpftest_all): Success! ✅ - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/10685982560 Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/96d64469d937 Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=886357 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-normal For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core)
On Tue, 3 Sep 2024 17:18:42 +0200, Eric Dumazet <edumazet@google.com> wrote: >On Tue, Sep 3, 2024 at 5:10 PM Edward Adam Davis <eadavis@qq.com> wrote: >> >> There are two paths to access mptcp_pm_del_add_timer, result in a race >> condition: >> >> CPU1 CPU2 >> ==== ==== >> net_rx_action >> napi_poll netlink_sendmsg >> __napi_poll netlink_unicast >> process_backlog netlink_unicast_kernel >> __netif_receive_skb genl_rcv >> __netif_receive_skb_one_core netlink_rcv_skb >> NF_HOOK genl_rcv_msg >> ip_local_deliver_finish genl_family_rcv_msg >> ip_protocol_deliver_rcu genl_family_rcv_msg_doit >> tcp_v4_rcv mptcp_pm_nl_flush_addrs_doit >> tcp_v4_do_rcv mptcp_nl_remove_addrs_list >> tcp_rcv_established mptcp_pm_remove_addrs_and_subflows >> tcp_data_queue remove_anno_list_by_saddr >> mptcp_incoming_options mptcp_pm_del_add_timer >> mptcp_pm_del_add_timer kfree(entry) >> >> In remove_anno_list_by_saddr(running on CPU2), after leaving the critical >> zone protected by "pm.lock", the entry will be released, which leads to the >> occurrence of uaf in the mptcp_pm_del_add_timer(running on CPU1). >> >> Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@syzkaller.appspotmail.com >> Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d >> Signed-off-by: Edward Adam Davis <eadavis@qq.com> >> --- >> net/mptcp/pm_netlink.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c >> index 3e4ad801786f..d28bf0c9ad66 100644 >> --- a/net/mptcp/pm_netlink.c >> +++ b/net/mptcp/pm_netlink.c >> @@ -336,11 +336,12 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk, >> entry = mptcp_lookup_anno_list_by_saddr(msk, addr); >> if (entry && (!check_id || entry->addr.id == addr->id)) >> entry->retrans_times = ADD_ADDR_RETRANS_MAX; >> - spin_unlock_bh(&msk->pm.lock); >> >> if (entry && (!check_id || entry->addr.id == addr->id)) >> sk_stop_timer_sync(sk, &entry->add_timer); >> >> + spin_unlock_bh(&msk->pm.lock); > > >mptcp_pm_add_timer() needs to lock msk->pm.lock > >Your patch might add a deadlock, because sk_stop_timer_sync() is Got it. MPTCP CI reported it. >calling del_timer_sync() > >What is preventing this ? I will change the strategy for protecting entry and use pm.lock to synchronize when it is released in remove_anno_list_by_saddr. Link: https://syzkaller.appspot.com/text?tag=Patch&x=124282ab980000 BR, Edward
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 3e4ad801786f..d28bf0c9ad66 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -336,11 +336,12 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk, entry = mptcp_lookup_anno_list_by_saddr(msk, addr); if (entry && (!check_id || entry->addr.id == addr->id)) entry->retrans_times = ADD_ADDR_RETRANS_MAX; - spin_unlock_bh(&msk->pm.lock); if (entry && (!check_id || entry->addr.id == addr->id)) sk_stop_timer_sync(sk, &entry->add_timer); + spin_unlock_bh(&msk->pm.lock); + return entry; }
There are two paths to access mptcp_pm_del_add_timer, result in a race condition: CPU1 CPU2 ==== ==== net_rx_action napi_poll netlink_sendmsg __napi_poll netlink_unicast process_backlog netlink_unicast_kernel __netif_receive_skb genl_rcv __netif_receive_skb_one_core netlink_rcv_skb NF_HOOK genl_rcv_msg ip_local_deliver_finish genl_family_rcv_msg ip_protocol_deliver_rcu genl_family_rcv_msg_doit tcp_v4_rcv mptcp_pm_nl_flush_addrs_doit tcp_v4_do_rcv mptcp_nl_remove_addrs_list tcp_rcv_established mptcp_pm_remove_addrs_and_subflows tcp_data_queue remove_anno_list_by_saddr mptcp_incoming_options mptcp_pm_del_add_timer mptcp_pm_del_add_timer kfree(entry) In remove_anno_list_by_saddr(running on CPU2), after leaving the critical zone protected by "pm.lock", the entry will be released, which leads to the occurrence of uaf in the mptcp_pm_del_add_timer(running on CPU1). Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d Signed-off-by: Edward Adam Davis <eadavis@qq.com> --- net/mptcp/pm_netlink.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)