Message ID | 20240926075646.15592-1-lena.wang@mediatek.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tcp: check if skb is true to avoid crash | expand |
On 26/09/2024 08:56, Lena Wang wrote: > A kernel NULL pointer dereference reported. > Backtrace: > vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, len=755) > + 28 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2315> > vmlinux tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2452> > vmlinux tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128, > nonagle=-2145862684, push_one=0, gfp=2080) + 3296 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2689> > vmlinux tcp_tsq_write() + 172 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1033> > vmlinux tcp_tsq_handler() + 104 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1042> > vmlinux tcp_tasklet_func() + 208 > > When there is no pending skb in sk->sk_write_queue, tcp_send_head > returns NULL. Directly dereference of skb->len will result crash. > So it is necessary to evaluate the skb to be true here. > > Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe") > Signed-off-by: Lena Wang <lena.wang@mediatek.com> > --- > net/ipv4/tcp_output.c | 20 +++++++++++--------- > 1 file changed, 11 insertions(+), 9 deletions(-) > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index 4fd746bd4d54..12cde5d879c5 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -2338,17 +2338,19 @@ static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len) > struct sk_buff *skb, *next; > > skb = tcp_send_head(sk); > - tcp_for_write_queue_from_safe(skb, next, sk) { > - if (len <= skb->len) > - break; > + if (skb) { > + tcp_for_write_queue_from_safe(skb, next, sk) { > + if (len <= skb->len) > + break; Hi Lena! I believe the patch can be simplified by using "fast return" if (!skb) return true; This will make less changes and can simplify further bisecting. Thanks, Vadim > - if (unlikely(TCP_SKB_CB(skb)->eor) || > - tcp_has_tx_tstamp(skb) || > - !skb_pure_zcopy_same(skb, next) || > - skb_frags_readable(skb) != skb_frags_readable(next)) > - return false; > + if (unlikely(TCP_SKB_CB(skb)->eor) || > + tcp_has_tx_tstamp(skb) || > + !skb_pure_zcopy_same(skb, next) || > + skb_frags_readable(skb) != skb_frags_readable(next)) > + return false; > > - len -= skb->len; > + len -= skb->len; > + } > } > > return true;
On 26/09/2024 08:56, Lena Wang wrote: > A kernel NULL pointer dereference reported. > Backtrace: > vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, len=755) > + 28 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2315> > vmlinux tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2452> > vmlinux tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128, > nonagle=-2145862684, push_one=0, gfp=2080) + 3296 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2689> > vmlinux tcp_tsq_write() + 172 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1033> > vmlinux tcp_tsq_handler() + 104 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1042> > vmlinux tcp_tasklet_func() + 208 > > When there is no pending skb in sk->sk_write_queue, tcp_send_head > returns NULL. Directly dereference of skb->len will result crash. > So it is necessary to evaluate the skb to be true here. > > Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe") > Signed-off-by: Lena Wang <lena.wang@mediatek.com> > --- > net/ipv4/tcp_output.c | 20 +++++++++++--------- > 1 file changed, 11 insertions(+), 9 deletions(-) > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index 4fd746bd4d54..12cde5d879c5 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -2338,17 +2338,19 @@ static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len) > struct sk_buff *skb, *next; > > skb = tcp_send_head(sk); > - tcp_for_write_queue_from_safe(skb, next, sk) { > - if (len <= skb->len) > - break; > + if (skb) { Thinking more of this, I don't really understand how is it possible to reach tcp_can_coalesce_send_queue_head() with empty send queue, but anyway, this patch will move NULL dereference further to tcp_mtu_probe() where new skb is build with the same tcp_send_head() call. I believe the proper way is to return false here in case of missing skb: if (!skb) return false; This will prevent tcp_mtu_probe() to continue execution and avoid possible NULL dereference. > + tcp_for_write_queue_from_safe(skb, next, sk) { > + if (len <= skb->len) > + break; > > - if (unlikely(TCP_SKB_CB(skb)->eor) || > - tcp_has_tx_tstamp(skb) || > - !skb_pure_zcopy_same(skb, next) || > - skb_frags_readable(skb) != skb_frags_readable(next)) > - return false; > + if (unlikely(TCP_SKB_CB(skb)->eor) || > + tcp_has_tx_tstamp(skb) || > + !skb_pure_zcopy_same(skb, next) || > + skb_frags_readable(skb) != skb_frags_readable(next)) > + return false; > > - len -= skb->len; > + len -= skb->len; > + } > } > > return true;
On Thu, Sep 26, 2024 at 9:55 AM Lena Wang <lena.wang@mediatek.com> wrote: > > A kernel NULL pointer dereference reported. > Backtrace: > vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, len=755) > + 28 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2315> > vmlinux tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2452> > vmlinux tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128, > nonagle=-2145862684, push_one=0, gfp=2080) + 3296 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2689> > vmlinux tcp_tsq_write() + 172 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1033> > vmlinux tcp_tsq_handler() + 104 > </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1042> > vmlinux tcp_tasklet_func() + 208 > > When there is no pending skb in sk->sk_write_queue, tcp_send_head > returns NULL. Directly dereference of skb->len will result crash. > So it is necessary to evaluate the skb to be true here. > > Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe") > Signed-off-by: Lena Wang <lena.wang@mediatek.com> > --- I am not sure why tcp_send_head() can return NULL. Before tcp_can_coalesce_send_queue_head() is called, we have this code : size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache; /* Have enough data in the send queue to probe? */ if (tp->write_seq - tp->snd_nxt < size_needed) return -1; Do you have a repro ?
On Thu, 2024-09-26 at 11:07 +0200, Eric Dumazet wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On Thu, Sep 26, 2024 at 9:55 AM Lena Wang <lena.wang@mediatek.com> > wrote: > > > > A kernel NULL pointer dereference reported. > > Backtrace: > > vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, > len=755) > > + 28 </alps/OfficialRelease/Of/alps/kernel- > 6.6/net/ipv4/tcp_output.c:2315> > > vmlinux tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196 > > </alps/OfficialRelease/Of/alps/kernel- > 6.6/net/ipv4/tcp_output.c:2452> > > vmlinux tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128, > > nonagle=-2145862684, push_one=0, gfp=2080) + 3296 > > </alps/OfficialRelease/Of/alps/kernel- > 6.6/net/ipv4/tcp_output.c:2689> > > vmlinux tcp_tsq_write() + 172 > > </alps/OfficialRelease/Of/alps/kernel- > 6.6/net/ipv4/tcp_output.c:1033> > > vmlinux tcp_tsq_handler() + 104 > > </alps/OfficialRelease/Of/alps/kernel- > 6.6/net/ipv4/tcp_output.c:1042> > > vmlinux tcp_tasklet_func() + 208 > > > > When there is no pending skb in sk->sk_write_queue, tcp_send_head > > returns NULL. Directly dereference of skb->len will result crash. > > So it is necessary to evaluate the skb to be true here. > > > > Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe") > > Signed-off-by: Lena Wang <lena.wang@mediatek.com> > > --- > > I am not sure why tcp_send_head() can return NULL. > > Before tcp_can_coalesce_send_queue_head() is called, we have this > code : > > size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache; > > /* Have enough data in the send queue to probe? */ > if (tp->write_seq - tp->snd_nxt < size_needed) > return -1; > > > > Do you have a repro ? Hi Eric, It just happens once in monkey test. I can't reproduce it. from the dump log, it can see these values: (gdb) p tp->reordering $16 = 4 (gdb) p tp->mss_cache $17 = 128 probe_size = 755 size_needed = 755 + (4+1)*128 = 1395 (gdb) p tp->write_seq $18 = 1571343838 (gdb) p tp->snd_nxt $19 = 1571336917 tp->write_seq - tp->snd_nxt = 1571343838 - 1571336917 = 6921 > 1395
On Thu, Sep 26, 2024 at 4:05 PM Lena Wang (王娜) <Lena.Wang@mediatek.com> wrote: > > On Thu, 2024-09-26 at 11:07 +0200, Eric Dumazet wrote: > > > > External email : Please do not click links or open attachments until > > you have verified the sender or the content. > > On Thu, Sep 26, 2024 at 9:55 AM Lena Wang <lena.wang@mediatek.com> > > wrote: > > > > > > A kernel NULL pointer dereference reported. > > > Backtrace: > > > vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, > > len=755) > > > + 28 </alps/OfficialRelease/Of/alps/kernel- > > 6.6/net/ipv4/tcp_output.c:2315> > > > vmlinux tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196 > > > </alps/OfficialRelease/Of/alps/kernel- > > 6.6/net/ipv4/tcp_output.c:2452> > > > vmlinux tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128, > > > nonagle=-2145862684, push_one=0, gfp=2080) + 3296 > > > </alps/OfficialRelease/Of/alps/kernel- > > 6.6/net/ipv4/tcp_output.c:2689> > > > vmlinux tcp_tsq_write() + 172 > > > </alps/OfficialRelease/Of/alps/kernel- > > 6.6/net/ipv4/tcp_output.c:1033> > > > vmlinux tcp_tsq_handler() + 104 > > > </alps/OfficialRelease/Of/alps/kernel- > > 6.6/net/ipv4/tcp_output.c:1042> > > > vmlinux tcp_tasklet_func() + 208 > > > > > > When there is no pending skb in sk->sk_write_queue, tcp_send_head > > > returns NULL. Directly dereference of skb->len will result crash. > > > So it is necessary to evaluate the skb to be true here. > > > > > > Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe") > > > Signed-off-by: Lena Wang <lena.wang@mediatek.com> > > > --- > > > > I am not sure why tcp_send_head() can return NULL. > > > > Before tcp_can_coalesce_send_queue_head() is called, we have this > > code : > > > > size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache; > > > > /* Have enough data in the send queue to probe? */ > > if (tp->write_seq - tp->snd_nxt < size_needed) > > return -1; > > > > > > > > Do you have a repro ? > Hi Eric, > It just happens once in monkey test. I can't reproduce it. > > from the dump log, it can see these values: > (gdb) p tp->reordering > $16 = 4 > (gdb) p tp->mss_cache > $17 = 128 > probe_size = 755 > size_needed = 755 + (4+1)*128 = 1395 > > (gdb) p tp->write_seq > $18 = 1571343838 > (gdb) p tp->snd_nxt > $19 = 1571336917 > tp->write_seq - tp->snd_nxt = 1571343838 - 1571336917 = 6921 > 1395 OK thanks. Next question is : with 6921 bytes in the write queue, how tcp_send_head could possibly be NULL ? This would hint at a more serious bug breaking a fundamental invariant.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 4fd746bd4d54..12cde5d879c5 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2338,17 +2338,19 @@ static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len) struct sk_buff *skb, *next; skb = tcp_send_head(sk); - tcp_for_write_queue_from_safe(skb, next, sk) { - if (len <= skb->len) - break; + if (skb) { + tcp_for_write_queue_from_safe(skb, next, sk) { + if (len <= skb->len) + break; - if (unlikely(TCP_SKB_CB(skb)->eor) || - tcp_has_tx_tstamp(skb) || - !skb_pure_zcopy_same(skb, next) || - skb_frags_readable(skb) != skb_frags_readable(next)) - return false; + if (unlikely(TCP_SKB_CB(skb)->eor) || + tcp_has_tx_tstamp(skb) || + !skb_pure_zcopy_same(skb, next) || + skb_frags_readable(skb) != skb_frags_readable(next)) + return false; - len -= skb->len; + len -= skb->len; + } } return true;
A kernel NULL pointer dereference reported. Backtrace: vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, len=755) + 28 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2315> vmlinux tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2452> vmlinux tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128, nonagle=-2145862684, push_one=0, gfp=2080) + 3296 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2689> vmlinux tcp_tsq_write() + 172 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1033> vmlinux tcp_tsq_handler() + 104 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1042> vmlinux tcp_tasklet_func() + 208 When there is no pending skb in sk->sk_write_queue, tcp_send_head returns NULL. Directly dereference of skb->len will result crash. So it is necessary to evaluate the skb to be true here. Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe") Signed-off-by: Lena Wang <lena.wang@mediatek.com> --- net/ipv4/tcp_output.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)