diff mbox series

[v3,net] ipv6: tcp: drop silly ICMPv6 packet too big messages

Message ID 20210708072109.1241563-1-eric.dumazet@gmail.com (mailing list archive)
State Accepted
Commit c7bb4b89033b764eb07db4e060548a6311d801ee
Delegated to: Netdev Maintainers
Headers show
Series [v3,net] ipv6: tcp: drop silly ICMPv6 packet too big messages | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers warning 2 maintainers not CCed: yoshfuji@linux-ipv6.org dsahern@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 3 this patch: 3
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 48 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 3 this patch: 3
netdev/header_inline success Link

Commit Message

Eric Dumazet July 8, 2021, 7:21 a.m. UTC
From: Eric Dumazet <edumazet@google.com>

While TCP stack scales reasonably well, there is still one part that
can be used to DDOS it.

IPv6 Packet too big messages have to lookup/insert a new route,
and if abused by attackers, can easily put hosts under high stress,
with many cpus contending on a spinlock while one is stuck in fib6_run_gc()

ip6_protocol_deliver_rcu()
 icmpv6_rcv()
  icmpv6_notify()
   tcp_v6_err()
    tcp_v6_mtu_reduced()
     inet6_csk_update_pmtu()
      ip6_rt_update_pmtu()
       __ip6_rt_update_pmtu()
        ip6_rt_cache_alloc()
         ip6_dst_alloc()
          dst_alloc()
           ip6_dst_gc()
            fib6_run_gc()
             spin_lock_bh() ...

Some of our servers have been hit by malicious ICMPv6 packets
trying to _increase_ the MTU/MSS of TCP flows.

We believe these ICMPv6 packets are a result of a bug in one ISP stack,
since they were blindly sent back for _every_ (small) packet sent to them.

These packets are for one TCP flow:
09:24:36.266491 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.266509 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.316688 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.316704 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.608151 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240

TCP stack can filter some silly requests :

1) MTU below IPV6_MIN_MTU can be filtered early in tcp_v6_err()
2) tcp_v6_mtu_reduced() can drop requests trying to increase current MSS.

This tests happen before the IPv6 routing stack is entered, thus
removing the potential contention and route exhaustion.

Note that IPv6 stack was performing these checks, but too late
(ie : after the route has been added, and after the potential
garbage collect war)

v2: fix typo caught by Martin, thanks !
v3: exports tcp_mtu_to_mss(), caught by David, thanks !

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
---
 net/ipv4/tcp_output.c |  1 +
 net/ipv6/tcp_ipv6.c   | 19 +++++++++++++++++--
 2 files changed, 18 insertions(+), 2 deletions(-)

Comments

Martin KaFai Lau July 8, 2021, 5:46 p.m. UTC | #1
On Thu, Jul 08, 2021 at 12:21:09AM -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> While TCP stack scales reasonably well, there is still one part that
> can be used to DDOS it.
> 
> IPv6 Packet too big messages have to lookup/insert a new route,
> and if abused by attackers, can easily put hosts under high stress,
> with many cpus contending on a spinlock while one is stuck in fib6_run_gc()
> 
> ip6_protocol_deliver_rcu()
>  icmpv6_rcv()
>   icmpv6_notify()
>    tcp_v6_err()
>     tcp_v6_mtu_reduced()
>      inet6_csk_update_pmtu()
>       ip6_rt_update_pmtu()
>        __ip6_rt_update_pmtu()
>         ip6_rt_cache_alloc()
>          ip6_dst_alloc()
>           dst_alloc()
>            ip6_dst_gc()
>             fib6_run_gc()
>              spin_lock_bh() ...
> 
> Some of our servers have been hit by malicious ICMPv6 packets
> trying to _increase_ the MTU/MSS of TCP flows.
> 
> We believe these ICMPv6 packets are a result of a bug in one ISP stack,
> since they were blindly sent back for _every_ (small) packet sent to them.
> 
> These packets are for one TCP flow:
> 09:24:36.266491 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 09:24:36.266509 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 09:24:36.316688 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 09:24:36.316704 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 09:24:36.608151 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
> 
> TCP stack can filter some silly requests :
> 
> 1) MTU below IPV6_MIN_MTU can be filtered early in tcp_v6_err()
> 2) tcp_v6_mtu_reduced() can drop requests trying to increase current MSS.
> 
> This tests happen before the IPv6 routing stack is entered, thus
> removing the potential contention and route exhaustion.
> 
> Note that IPv6 stack was performing these checks, but too late
> (ie : after the route has been added, and after the potential
> garbage collect war)
> 
> v2: fix typo caught by Martin, thanks !
> v3: exports tcp_mtu_to_mss(), caught by David, thanks !
Acked-by: Martin KaFai Lau <kafai@fb.com>
patchwork-bot+netdevbpf@kernel.org July 8, 2021, 7:30 p.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Thu,  8 Jul 2021 00:21:09 -0700 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> While TCP stack scales reasonably well, there is still one part that
> can be used to DDOS it.
> 
> IPv6 Packet too big messages have to lookup/insert a new route,
> and if abused by attackers, can easily put hosts under high stress,
> with many cpus contending on a spinlock while one is stuck in fib6_run_gc()
> 
> [...]

Here is the summary with links:
  - [v3,net] ipv6: tcp: drop silly ICMPv6 packet too big messages
    https://git.kernel.org/netdev/net/c/c7bb4b89033b

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Jakub Kicinski March 15, 2022, 11:19 p.m. UTC | #3
On Thu,  8 Jul 2021 00:21:09 -0700 Eric Dumazet wrote:
> +	/* Drop requests trying to increase our current mss.
> +	 * Check done in __ip6_rt_update_pmtu() is too late.
> +	 */
> +	if (tcp_mtu_to_mss(sk, mtu) >= tcp_sk(sk)->mss_cache)
> +		return;

Hi Eric! I think this breaks TFO + pMTU.

mss_cache is 1208 but TFO seems to use tp->rx_opt.mss_clamp
to size the packet. So we ignore the incoming PTB. Should we 
init mss_cache for TFO?
diff mbox series

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index bde781f46b41a5dd9eb8db3fb65b45d73e592b4b..29553fce8502861087830b94cc4fbebfce6e60dc 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1732,6 +1732,7 @@  int tcp_mtu_to_mss(struct sock *sk, int pmtu)
 	return __tcp_mtu_to_mss(sk, pmtu) -
 	       (tcp_sk(sk)->tcp_header_len - sizeof(struct tcphdr));
 }
+EXPORT_SYMBOL(tcp_mtu_to_mss);
 
 /* Inverse of above */
 int tcp_mss_to_mtu(struct sock *sk, int mss)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 593c32fe57ed13a218492fd6056f2593e601ec79..323989927a0a6a2274bcbc1cd0ac72e9d49b24ad 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -348,11 +348,20 @@  static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 static void tcp_v6_mtu_reduced(struct sock *sk)
 {
 	struct dst_entry *dst;
+	u32 mtu;
 
 	if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
 		return;
 
-	dst = inet6_csk_update_pmtu(sk, READ_ONCE(tcp_sk(sk)->mtu_info));
+	mtu = READ_ONCE(tcp_sk(sk)->mtu_info);
+
+	/* Drop requests trying to increase our current mss.
+	 * Check done in __ip6_rt_update_pmtu() is too late.
+	 */
+	if (tcp_mtu_to_mss(sk, mtu) >= tcp_sk(sk)->mss_cache)
+		return;
+
+	dst = inet6_csk_update_pmtu(sk, mtu);
 	if (!dst)
 		return;
 
@@ -433,6 +442,8 @@  static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	}
 
 	if (type == ICMPV6_PKT_TOOBIG) {
+		u32 mtu = ntohl(info);
+
 		/* We are not interested in TCP_LISTEN and open_requests
 		 * (SYN-ACKs send out by Linux are always <576bytes so
 		 * they should go through unfragmented).
@@ -443,7 +454,11 @@  static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		if (!ip6_sk_accept_pmtu(sk))
 			goto out;
 
-		WRITE_ONCE(tp->mtu_info, ntohl(info));
+		if (mtu < IPV6_MIN_MTU)
+			goto out;
+
+		WRITE_ONCE(tp->mtu_info, mtu);
+
 		if (!sock_owned_by_user(sk))
 			tcp_v6_mtu_reduced(sk);
 		else if (!test_and_set_bit(TCP_MTU_REDUCED_DEFERRED,