diff mbox series

[net-next] tcp: drop skb dst in tcp_rcv_established()

Message ID 20220430011523.3004693-1-eric.dumazet@gmail.com (mailing list archive)
State Accepted
Commit 783d108dd71d97e4cac5fe8ce70ca43ed7dc7bb7
Delegated to: Netdev Maintainers
Series [net-next] tcp: drop skb dst in tcp_rcv_established()

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 3 this patch: 3
netdev/cc_maintainers           warning  2 maintainers not CCed: yoshfuji@linux-ipv6.org dsahern@kernel.org
netdev/build_clang              success  Errors and warnings before: 9 this patch: 9
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 3 this patch: 3
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 7 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Eric Dumazet April 30, 2022, 1:15 a.m. UTC
From: Eric Dumazet <edumazet@google.com>

In commit f84af32cbca7 ("net: ip_queue_rcv_skb() helper")
I dropped the skb dst in tcp_data_queue().

This only dealt with so-called TCP input slow path.

When fast path is taken, tcp_rcv_established() calls
tcp_queue_rcv() while skb still has a dst.

This was mostly fine, because most dsts at this point
are not refcounted (thanks to early demux)

However, TCP packets sent over loopback have refcounted dst.

Then commit 68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") came and had the effect
of delaying skb freeing for an arbitrary time.

If during this time the involved netns is dismantled, cleanup_net()
frees the struct net with embedded net->ipv6.ip6_dst_ops.

Then when eventually dst_destroy_rcu() is called,
if (dst->ops->destroy) ... triggers a use-after-free.

It is not clear if ip6_route_net_exit() lacks a rcu_barrier()
as syzbot reported similar issues before the blamed commit.

( https://groups.google.com/g/syzkaller-bugs/c/CofzW4eeA9A/m/009WjumTAAAJ )

Fixes: 68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 1 +
 1 file changed, 1 insertion(+)
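
The sequence described in the commit message can be replayed in a small
user-space model. This is only a sketch: every toy_* name is hypothetical,
the "alive" flag stands in for freed memory so the program never really
touches freed storage, and the skb is assumed to hold the last reference
on the dst.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; these are not the real definitions. */
struct toy_dst_ops { bool alive; };               /* models net->ipv6.ip6_dst_ops */
struct toy_net     { struct toy_dst_ops dst_ops; }; /* ops embedded in the netns */
struct toy_dst     { struct toy_dst_ops *ops; int refcnt; };
struct toy_skb     { struct toy_dst *dst; };

/* Put one reference. Returns true if the final put would read dead ops,
 * i.e. the point where the real kernel would hit the use-after-free. */
static bool toy_dst_release(struct toy_dst *dst)
{
	assert(dst->refcnt > 0);
	if (--dst->refcnt > 0)
		return false;
	return !dst->ops->alive;
}

/* Detach the dst from the skb, releasing the skb's reference. */
static bool toy_skb_dst_drop(struct toy_skb *skb)
{
	bool stale = false;

	if (skb->dst) {
		stale = toy_dst_release(skb->dst);
		skb->dst = NULL;
	}
	return stale;
}

/* Replay: queue the skb, dismantle the netns, free the skb much later.
 * Returns true if the deferred free would have used freed dst ops. */
static bool toy_replay(bool drop_dst_when_queued)
{
	struct toy_net net = { .dst_ops = { .alive = true } };
	struct toy_dst dst = { .ops = &net.dst_ops, .refcnt = 1 };
	struct toy_skb skb = { .dst = &dst };
	bool stale = false;

	if (drop_dst_when_queued)
		stale |= toy_skb_dst_drop(&skb); /* what the patch adds */

	net.dst_ops.alive = false;               /* netns dismantled */

	stale |= toy_skb_dst_drop(&skb);         /* deferred skb free */
	return stale;
}
```

In this model toy_replay(false) reports the stale access and
toy_replay(true) does not, because the dst reference is gone before the
netns is torn down.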

Comments

Neal Cardwell April 30, 2022, 2:20 a.m. UTC | #1
On Fri, Apr 29, 2022 at 9:15 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> In commit f84af32cbca7 ("net: ip_queue_rcv_skb() helper")
> I dropped the skb dst in tcp_data_queue().
>
> This only dealt with so-called TCP input slow path.
>
> When fast path is taken, tcp_rcv_established() calls
> tcp_queue_rcv() while skb still has a dst.
>
> This was mostly fine, because most dsts at this point
> are not refcounted (thanks to early demux)
>
> However, TCP packets sent over loopback have refcounted dst.
>
> Then commit 68822bdf76f1 ("net: generalize skb freeing
> deferral to per-cpu lists") came and had the effect
> of delaying skb freeing for an arbitrary time.
>
> If during this time the involved netns is dismantled, cleanup_net()
> frees the struct net with embedded net->ipv6.ip6_dst_ops.
>
> Then when eventually dst_destroy_rcu() is called,
> if (dst->ops->destroy) ... triggers a use-after-free.
>
> It is not clear if ip6_route_net_exit() lacks a rcu_barrier()
> as syzbot reported similar issues before the blamed commit.
>
> ( https://groups.google.com/g/syzkaller-bugs/c/CofzW4eeA9A/m/009WjumTAAAJ )
>
> Fixes: 68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/ipv4/tcp_input.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index cc3de8dc57970c97316ad1591cac0ca5f1a24c47..97cfcd85f84e6f873c3e60c388e6c27628451a7d 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5928,6 +5928,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
>                         NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPHITS);
>
>                         /* Bulk data transfer: receiver */
> +                       skb_dst_drop(skb);
>                         __skb_pull(skb, tcp_header_len);
>                         eaten = tcp_queue_rcv(sk, skb, &fragstolen);
>
> --

Nice catch. Thanks, Eric!

Acked-by: Neal Cardwell <ncardwell@google.com>

neal
Soheil Hassas Yeganeh April 30, 2022, 2:47 a.m. UTC | #2
On Fri, Apr 29, 2022 at 10:20 PM Neal Cardwell <ncardwell@google.com> wrote:
>
> Nice catch. Thanks, Eric!
>
> Acked-by: Neal Cardwell <ncardwell@google.com>
>
> neal

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

Thank you for the fix!
patchwork-bot+netdevbpf@kernel.org April 30, 2022, 12:30 p.m. UTC | #3
Hello:

This patch was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Fri, 29 Apr 2022 18:15:23 -0700 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> In commit f84af32cbca7 ("net: ip_queue_rcv_skb() helper")
> I dropped the skb dst in tcp_data_queue().
> 
> This only dealt with so-called TCP input slow path.
> 
> [...]

Here is the summary with links:
  - [net-next] tcp: drop skb dst in tcp_rcv_established()
    https://git.kernel.org/netdev/net-next/c/783d108dd71d

You are awesome, thank you!

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index cc3de8dc57970c97316ad1591cac0ca5f1a24c47..97cfcd85f84e6f873c3e60c388e6c27628451a7d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5928,6 +5928,7 @@  void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPHITS);
 
 			/* Bulk data transfer: receiver */
+			skb_dst_drop(skb);
 			__skb_pull(skb, tcp_header_len);
 			eaten = tcp_queue_rcv(sk, skb, &fragstolen);
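
Why the added call is cheap for most packets but matters for loopback can
be sketched with a user-space model of the refcounted versus non-refcounted
dst distinction mentioned in the commit message. This is an illustration
under assumptions, not the kernel code: the toy_* names and the TOY_DST_*
macros are hypothetical, the low-bit tagging is modeled on the kernel's
SKB_DST_NOREF idea, and toy_skb_dst_set() takes its own reference purely
to keep the demo short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_DST_NOREF   ((uintptr_t)1)      /* low bit: no reference held */
#define TOY_DST_PTRMASK (~TOY_DST_NOREF)

struct toy_dst { int refcnt; };
struct toy_skb { uintptr_t refdst; };       /* pointer plus NOREF tag bit */

/* Attach a refcounted dst (toy simplification: takes its own reference). */
static void toy_skb_dst_set(struct toy_skb *skb, struct toy_dst *dst)
{
	dst->refcnt++;
	skb->refdst = (uintptr_t)dst;
}

/* Attach a dst without taking a reference, as after early demux. */
static void toy_skb_dst_set_noref(struct toy_skb *skb, struct toy_dst *dst)
{
	skb->refdst = (uintptr_t)dst | TOY_DST_NOREF;
}

/* Detach the dst; only a refcounted attachment gives a reference back. */
static void toy_skb_dst_drop(struct toy_skb *skb)
{
	if (!skb->refdst)
		return;
	if (!(skb->refdst & TOY_DST_NOREF)) {
		struct toy_dst *dst =
			(struct toy_dst *)(skb->refdst & TOY_DST_PTRMASK);

		assert(dst->refcnt > 0);
		dst->refcnt--;
	}
	skb->refdst = 0;
}

/* Reference count seen while the skb is queued with the dst attached. */
static int toy_refs_while_attached(bool refcounted)
{
	struct toy_dst dst = { .refcnt = 1 };   /* owner's reference */
	struct toy_skb skb = { 0 };

	if (refcounted)
		toy_skb_dst_set(&skb, &dst);
	else
		toy_skb_dst_set_noref(&skb, &dst);
	return dst.refcnt;
}

/* Reference count once the dst has been dropped from the skb. */
static int toy_refs_after_drop(bool refcounted)
{
	struct toy_dst dst = { .refcnt = 1 };
	struct toy_skb skb = { 0 };

	if (refcounted)
		toy_skb_dst_set(&skb, &dst);
	else
		toy_skb_dst_set_noref(&skb, &dst);
	toy_skb_dst_drop(&skb);
	return dst.refcnt;
}
```

In the model, only the refcounted case (the loopback case in the commit
message) pins the dst while the skb sits in the receive queue; dropping it
early returns the count to the owner's single reference in both cases.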