Message ID | 20210708065001.1150422-1-eric.dumazet@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [v2,net] ipv6: tcp: drop silly ICMPv6 packet too big messages |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 2 maintainers not CCed: yoshfuji@linux-ipv6.org dsahern@kernel.org |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | fail | Errors and warnings before: 2 this patch: 6 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 41 lines checked |
netdev/build_allmodconfig_warn | fail | Errors and warnings before: 2 this patch: 6 |
netdev/header_inline | success | Link |
On Wed, Jul 7, 2021 at 11:50 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> While TCP stack scales reasonably well, there is still one part that
> can be used to DDOS it.
>
> IPv6 Packet too big messages have to lookup/insert a new route,
> and if abused by attackers, can easily put hosts under high stress,
> with many cpus contending on a spinlock while one is stuck in fib6_run_gc()
>
> ip6_protocol_deliver_rcu()
>  icmpv6_rcv()
>   icmpv6_notify()
>    tcp_v6_err()
>     tcp_v6_mtu_reduced()
>      inet6_csk_update_pmtu()
>       ip6_rt_update_pmtu()
>        __ip6_rt_update_pmtu()
>         ip6_rt_cache_alloc()
>          ip6_dst_alloc()
>           dst_alloc()
>            ip6_dst_gc()
>             fib6_run_gc()
>              spin_lock_bh() ...
>
> Some of our servers have been hit by malicious ICMPv6 packets
> trying to _increase_ the MTU/MSS of TCP flows.
>
> We believe these ICMPv6 packets are a result of a bug in one ISP stack,
> since they were blindly sent back for _every_ (small) packet sent to them.
>
> These packets are for one TCP flow:
> 09:24:36.266491 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 09:24:36.266509 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 09:24:36.316688 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 09:24:36.316704 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 09:24:36.608151 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
>
> TCP stack can filter some silly requests :
>
> 1) MTU below IPV6_MIN_MTU can be filtered early in tcp_v6_err()
> 2) tcp_v6_mtu_reduced() can drop requests trying to increase current MSS.
>
> These tests happen before the IPv6 routing stack is entered, thus
> removing the potential contention and route exhaustion.
>
> Note that IPv6 stack was performing these checks, but too late
> (ie : after the route has been added, and after the potential
> garbage collect war)
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reviewed-by: Maciej Żenczykowski <maze@google.com>
> Cc: Martin KaFai Lau <kafai@fb.com>
> ---
> v2: fix typo caught by Martin, thanks !
>
>  net/ipv6/tcp_ipv6.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 593c32fe57ed13a218492fd6056f2593e601ec79..323989927a0a6a2274bcbc1cd0ac72e9d49b24ad 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -348,11 +348,20 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
>  static void tcp_v6_mtu_reduced(struct sock *sk)
>  {
>  	struct dst_entry *dst;
> +	u32 mtu;
>
>  	if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
>  		return;
>
> -	dst = inet6_csk_update_pmtu(sk, READ_ONCE(tcp_sk(sk)->mtu_info));
> +	mtu = READ_ONCE(tcp_sk(sk)->mtu_info);
> +
> +	/* Drop requests trying to increase our current mss.
> +	 * Check done in __ip6_rt_update_pmtu() is too late.
> +	 */
> +	if (tcp_mtu_to_mss(sk, mtu) >= tcp_sk(sk)->mss_cache)
> +		return;
> +
> +	dst = inet6_csk_update_pmtu(sk, mtu);
>  	if (!dst)
>  		return;
>
> @@ -433,6 +442,8 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>  	}
>
>  	if (type == ICMPV6_PKT_TOOBIG) {
> +		u32 mtu = ntohl(info);
> +
>  		/* We are not interested in TCP_LISTEN and open_requests
>  		 * (SYN-ACKs send out by Linux are always <576bytes so
>  		 * they should go through unfragmented).
> @@ -443,7 +454,11 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>  		if (!ip6_sk_accept_pmtu(sk))
>  			goto out;
>
> -		WRITE_ONCE(tp->mtu_info, ntohl(info));
> +		if (mtu < IPV6_MIN_MTU)
> +			goto out;
> +
> +		WRITE_ONCE(tp->mtu_info, mtu);
> +
>  		if (!sock_owned_by_user(sk))
>  			tcp_v6_mtu_reduced(sk);
>  		else if (!test_and_set_bit(TCP_MTU_REDUCED_DEFERRED,
> --
> 2.32.0.93.g670b81a890-goog

(this looks fine)

btw. is there a need/desire for a similar change for ipv4?
tcp_mtu_to_mss needs to be exported
On Thu, Jul 8, 2021 at 8:56 AM Maciej Żenczykowski <maze@google.com> wrote:
>
> On Wed, Jul 7, 2021 at 11:50 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> (this looks fine)
>
> btw. is there a need/desire for a similar change for ipv4?

My understanding is that in IPv4, the relevant check is done before
the FIB lookup, so we should be fine.
On Thu, Jul 8, 2021 at 9:13 AM David Miller <davem@davemloft.net> wrote:
>
> tcp_mtu_to_mss needs to be exported

Arg, right you are.

Thanks !
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 593c32fe57ed13a218492fd6056f2593e601ec79..323989927a0a6a2274bcbc1cd0ac72e9d49b24ad 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -348,11 +348,20 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 static void tcp_v6_mtu_reduced(struct sock *sk)
 {
 	struct dst_entry *dst;
+	u32 mtu;
 
 	if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
 		return;
 
-	dst = inet6_csk_update_pmtu(sk, READ_ONCE(tcp_sk(sk)->mtu_info));
+	mtu = READ_ONCE(tcp_sk(sk)->mtu_info);
+
+	/* Drop requests trying to increase our current mss.
+	 * Check done in __ip6_rt_update_pmtu() is too late.
+	 */
+	if (tcp_mtu_to_mss(sk, mtu) >= tcp_sk(sk)->mss_cache)
+		return;
+
+	dst = inet6_csk_update_pmtu(sk, mtu);
 	if (!dst)
 		return;
 
@@ -433,6 +442,8 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	}
 
 	if (type == ICMPV6_PKT_TOOBIG) {
+		u32 mtu = ntohl(info);
+
 		/* We are not interested in TCP_LISTEN and open_requests
 		 * (SYN-ACKs send out by Linux are always <576bytes so
 		 * they should go through unfragmented).
@@ -443,7 +454,11 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		if (!ip6_sk_accept_pmtu(sk))
 			goto out;
 
-		WRITE_ONCE(tp->mtu_info, ntohl(info));
+		if (mtu < IPV6_MIN_MTU)
+			goto out;
+
+		WRITE_ONCE(tp->mtu_info, mtu);
+
 		if (!sock_owned_by_user(sk))
 			tcp_v6_mtu_reduced(sk);
 		else if (!test_and_set_bit(TCP_MTU_REDUCED_DEFERRED,
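
For readers who want to experiment with the two early checks outside the kernel, here is a minimal userspace sketch of the filtering order the patch describes. It is not the kernel code: `struct model_sock`, `model_mtu_to_mss()` and the other `model_*` names are hypothetical, and the MSS computation assumes a fixed 60-byte overhead (40-byte IPv6 header plus 20-byte TCP header, no options or extension headers), whereas the real tcp_mtu_to_mss() accounts for more than that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum IPv6 link MTU (1280 bytes); stands in for the kernel's IPV6_MIN_MTU. */
#define MODEL_IPV6_MIN_MTU 1280u

/* ASSUMPTION: fixed overhead of IPv6 header (40) + TCP header (20). */
#define MODEL_HDR_OVERHEAD 60u

/* Hypothetical stand-in for the two socket fields the patch touches. */
struct model_sock {
	uint32_t mss_cache; /* MSS currently used by the flow */
	uint32_t mtu_info;  /* last MTU accepted from a Packet Too Big message */
};

/* Simplified model of an MTU-to-MSS conversion (not the kernel's logic). */
uint32_t model_mtu_to_mss(uint32_t mtu)
{
	return mtu > MODEL_HDR_OVERHEAD ? mtu - MODEL_HDR_OVERHEAD : 0;
}

/* Check 1 (done in tcp_v6_err() in the patch): ignore MTUs below the
 * IPv6 minimum, otherwise record the advertised value.
 */
bool model_accept_pkt_toobig(struct model_sock *sk, uint32_t mtu)
{
	if (mtu < MODEL_IPV6_MIN_MTU)
		return false;
	sk->mtu_info = mtu;
	return true;
}

/* Check 2 (done in tcp_v6_mtu_reduced() in the patch): only proceed to
 * the route update if the advertised MTU would actually lower the MSS.
 */
bool model_needs_route_update(const struct model_sock *sk)
{
	return model_mtu_to_mss(sk->mtu_info) < sk->mss_cache;
}

/* Returns true only when the (expensive) route update would be reached. */
bool model_handle_pkt_toobig(struct model_sock *sk, uint32_t mtu)
{
	return model_accept_pkt_toobig(sk, mtu) && model_needs_route_update(sk);
}
```

Under these assumptions, a flow with `mss_cache` 1220 that receives "mtu 1460" (as in the tcpdump output quoted in the commit message) is rejected by the second check, because the implied MSS of 1400 is not smaller than the current one, so the routing code is never entered.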