| Message ID | 20221208120452.556997-1-liuhangbin@gmail.com (mailing list archive) |
|---|---|
| State | Accepted |
| Commit | 3cf7203ca620682165706f70a1b12b5194607dce |
| Delegated to: | Netdev Maintainers |
| Series | [net] net/tunnel: wait until all sk_user_data reader finish before releasing the sock |
Thu, Dec 08, 2022 at 01:04:52PM CET, liuhangbin@gmail.com wrote:
>There is a race condition in vxlan that when deleting a vxlan device
>during receiving packets, there is a possibility that the sock is
>released after getting vxlan_sock vs from sk_user_data. Then in
>later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got
>NULL pointer dereference. e.g.
>
> #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
> #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
> #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
> #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
> #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
> #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
> #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
>    [exception RIP: vxlan_ecn_decapsulate+0x3b]
>    RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
>    RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
>    RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
>    RBP: ffff8a9fc7ab803e  R8:  0000000000700000  R9:  00000000000010ae
>    R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
>    R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
>    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
> #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
> #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
>#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
>#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
>#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
>#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
>#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
>#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
>#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
>#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
>#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3
>
>Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh
>
>Fix this by waiting for all sk_user_data reader to finish before
>releasing the sock.
>
>Reported-by: Jianlin Shi <jishi@redhat.com>
>Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
>Fixes: 6a93cc905274 ("udp-tunnel: Add a few more UDP tunnel APIs")
>Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu, 8 Dec 2022 20:04:52 +0800 you wrote:
> There is a race condition in vxlan that when deleting a vxlan device
> during receiving packets, there is a possibility that the sock is
> released after getting vxlan_sock vs from sk_user_data. Then in
> later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got
> NULL pointer dereference. e.g.
>
> [...]

Here is the summary with links:
  - [net] net/tunnel: wait until all sk_user_data reader finish before releasing the sock
    https://git.kernel.org/netdev/net/c/3cf7203ca620

You are awesome, thank you!
diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index 8242c8947340..5f8104cf082d 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -176,6 +176,7 @@ EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb);
 void udp_tunnel_sock_release(struct socket *sock)
 {
 	rcu_assign_sk_user_data(sock->sk, NULL);
+	synchronize_rcu();
 	kernel_sock_shutdown(sock, SHUT_RDWR);
 	sock_release(sock);
 }
There is a race condition in vxlan that when deleting a vxlan device
during receiving packets, there is a possibility that the sock is
released after getting the vxlan_sock vs from sk_user_data. Then
later, in vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will get
a NULL pointer dereference. e.g.

 #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
 #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
 #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
 #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
 #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
 #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
 #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
    [exception RIP: vxlan_ecn_decapsulate+0x3b]
    RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
    RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
    RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
    RBP: ffff8a9fc7ab803e  R8:  0000000000700000  R9:  00000000000010ae
    R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
    R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
 #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
 #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3

Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh

Fix this by waiting for all sk_user_data readers to finish before
releasing the sock.

Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Fixes: 6a93cc905274 ("udp-tunnel: Add a few more UDP tunnel APIs")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 net/ipv4/udp_tunnel_core.c | 1 +
 1 file changed, 1 insertion(+)