diff mbox series

[net] tcp: check if skb is true to avoid crash

Message ID 20240926075646.15592-1-lena.wang@mediatek.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] tcp: check if skb is true to avoid crash | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 4 maintainers not CCed: linux-arm-kernel@lists.infradead.org angelogioacchino.delregno@collabora.com linux-mediatek@lists.infradead.org matthias.bgg@gmail.com
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 16 this patch: 16
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 28 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-09-26--21-00 (tests: 768)

Commit Message

Lena Wang (王娜) Sept. 26, 2024, 7:56 a.m. UTC
A kernel NULL pointer dereference reported.
Backtrace:
vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, len=755)
+ 28 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2315>
vmlinux  tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196
</alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2452>
vmlinux  tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128,
nonagle=-2145862684, push_one=0, gfp=2080) + 3296
</alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2689>
vmlinux  tcp_tsq_write() + 172
</alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1033>
vmlinux  tcp_tsq_handler() + 104
</alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1042>
vmlinux  tcp_tasklet_func() + 208

When there is no pending skb in sk->sk_write_queue, tcp_send_head
returns NULL. Directly dereference of skb->len will result crash.
So it is necessary to evaluate the skb to be true here.

Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe")
Signed-off-by: Lena Wang <lena.wang@mediatek.com>
---
 net/ipv4/tcp_output.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

Comments

Vadim Fedorenko Sept. 26, 2024, 8:50 a.m. UTC | #1
On 26/09/2024 08:56, Lena Wang wrote:
> A kernel NULL pointer dereference reported.
> Backtrace:
> vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, len=755)
> + 28 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2315>
> vmlinux  tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2452>
> vmlinux  tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128,
> nonagle=-2145862684, push_one=0, gfp=2080) + 3296
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2689>
> vmlinux  tcp_tsq_write() + 172
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1033>
> vmlinux  tcp_tsq_handler() + 104
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1042>
> vmlinux  tcp_tasklet_func() + 208
> 
> When there is no pending skb in sk->sk_write_queue, tcp_send_head
> returns NULL. Directly dereference of skb->len will result crash.
> So it is necessary to evaluate the skb to be true here.
> 
> Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe")
> Signed-off-by: Lena Wang <lena.wang@mediatek.com>
> ---
>   net/ipv4/tcp_output.c | 20 +++++++++++---------
>   1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 4fd746bd4d54..12cde5d879c5 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2338,17 +2338,19 @@ static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len)
>   	struct sk_buff *skb, *next;
>   
>   	skb = tcp_send_head(sk);
> -	tcp_for_write_queue_from_safe(skb, next, sk) {
> -		if (len <= skb->len)
> -			break;
> +	if (skb) {
> +		tcp_for_write_queue_from_safe(skb, next, sk) {
> +			if (len <= skb->len)
> +				break;

Hi Lena!

I believe the patch can be simplified by using "fast return"

if (!skb)
	return true;

This will make less changes and can simplify further bisecting.

Thanks,
Vadim

> -		if (unlikely(TCP_SKB_CB(skb)->eor) ||
> -		    tcp_has_tx_tstamp(skb) ||
> -		    !skb_pure_zcopy_same(skb, next) ||
> -		    skb_frags_readable(skb) != skb_frags_readable(next))
> -			return false;
> +			if (unlikely(TCP_SKB_CB(skb)->eor) ||
> +			    tcp_has_tx_tstamp(skb) ||
> +			    !skb_pure_zcopy_same(skb, next) ||
> +			    skb_frags_readable(skb) != skb_frags_readable(next))
> +				return false;
>   
> -		len -= skb->len;
> +			len -= skb->len;
> +		}
>   	}
>   
>   	return true;
Vadim Fedorenko Sept. 26, 2024, 9:06 a.m. UTC | #2
On 26/09/2024 08:56, Lena Wang wrote:
> A kernel NULL pointer dereference reported.
> Backtrace:
> vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, len=755)
> + 28 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2315>
> vmlinux  tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2452>
> vmlinux  tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128,
> nonagle=-2145862684, push_one=0, gfp=2080) + 3296
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2689>
> vmlinux  tcp_tsq_write() + 172
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1033>
> vmlinux  tcp_tsq_handler() + 104
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1042>
> vmlinux  tcp_tasklet_func() + 208
> 
> When there is no pending skb in sk->sk_write_queue, tcp_send_head
> returns NULL. Directly dereference of skb->len will result crash.
> So it is necessary to evaluate the skb to be true here.
> 
> Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe")
> Signed-off-by: Lena Wang <lena.wang@mediatek.com>
> ---
>   net/ipv4/tcp_output.c | 20 +++++++++++---------
>   1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 4fd746bd4d54..12cde5d879c5 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2338,17 +2338,19 @@ static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len)
>   	struct sk_buff *skb, *next;
>   
>   	skb = tcp_send_head(sk);
> -	tcp_for_write_queue_from_safe(skb, next, sk) {
> -		if (len <= skb->len)
> -			break;
> +	if (skb) {

Thinking more of this, I don't really understand how is it possible to
reach tcp_can_coalesce_send_queue_head() with empty send queue, but
anyway, this patch will move NULL dereference further to tcp_mtu_probe()
where new skb is build with the same tcp_send_head() call.

I believe the proper way is to return false here in case of missing skb:

if (!skb)
	return false;

This will prevent tcp_mtu_probe() to continue execution and avoid
possible NULL dereference.

> +		tcp_for_write_queue_from_safe(skb, next, sk) {
> +			if (len <= skb->len)
> +				break;
>   
> -		if (unlikely(TCP_SKB_CB(skb)->eor) ||
> -		    tcp_has_tx_tstamp(skb) ||
> -		    !skb_pure_zcopy_same(skb, next) ||
> -		    skb_frags_readable(skb) != skb_frags_readable(next))
> -			return false;
> +			if (unlikely(TCP_SKB_CB(skb)->eor) ||
> +			    tcp_has_tx_tstamp(skb) ||
> +			    !skb_pure_zcopy_same(skb, next) ||
> +			    skb_frags_readable(skb) != skb_frags_readable(next))
> +				return false;
>   
> -		len -= skb->len;
> +			len -= skb->len;
> +		}
>   	}
>   
>   	return true;
Eric Dumazet Sept. 26, 2024, 9:07 a.m. UTC | #3
On Thu, Sep 26, 2024 at 9:55 AM Lena Wang <lena.wang@mediatek.com> wrote:
>
> A kernel NULL pointer dereference reported.
> Backtrace:
> vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400, len=755)
> + 28 </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2315>
> vmlinux  tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2452>
> vmlinux  tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128,
> nonagle=-2145862684, push_one=0, gfp=2080) + 3296
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:2689>
> vmlinux  tcp_tsq_write() + 172
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1033>
> vmlinux  tcp_tsq_handler() + 104
> </alps/OfficialRelease/Of/alps/kernel-6.6/net/ipv4/tcp_output.c:1042>
> vmlinux  tcp_tasklet_func() + 208
>
> When there is no pending skb in sk->sk_write_queue, tcp_send_head
> returns NULL. Directly dereference of skb->len will result crash.
> So it is necessary to evaluate the skb to be true here.
>
> Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe")
> Signed-off-by: Lena Wang <lena.wang@mediatek.com>
> ---

I am not sure why tcp_send_head() can return NULL.

Before tcp_can_coalesce_send_queue_head() is called, we have this code :

size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache;

/* Have enough data in the send queue to probe? */
if (tp->write_seq - tp->snd_nxt < size_needed)
    return -1;



Do you have a repro ?
Lena Wang (王娜) Sept. 26, 2024, 2:04 p.m. UTC | #4
On Thu, 2024-09-26 at 11:07 +0200, Eric Dumazet wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On Thu, Sep 26, 2024 at 9:55 AM Lena Wang <lena.wang@mediatek.com>
> wrote:
> >
> > A kernel NULL pointer dereference reported.
> > Backtrace:
> > vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400,
> len=755)
> > + 28 </alps/OfficialRelease/Of/alps/kernel-
> 6.6/net/ipv4/tcp_output.c:2315>
> > vmlinux  tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196
> > </alps/OfficialRelease/Of/alps/kernel-
> 6.6/net/ipv4/tcp_output.c:2452>
> > vmlinux  tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128,
> > nonagle=-2145862684, push_one=0, gfp=2080) + 3296
> > </alps/OfficialRelease/Of/alps/kernel-
> 6.6/net/ipv4/tcp_output.c:2689>
> > vmlinux  tcp_tsq_write() + 172
> > </alps/OfficialRelease/Of/alps/kernel-
> 6.6/net/ipv4/tcp_output.c:1033>
> > vmlinux  tcp_tsq_handler() + 104
> > </alps/OfficialRelease/Of/alps/kernel-
> 6.6/net/ipv4/tcp_output.c:1042>
> > vmlinux  tcp_tasklet_func() + 208
> >
> > When there is no pending skb in sk->sk_write_queue, tcp_send_head
> > returns NULL. Directly dereference of skb->len will result crash.
> > So it is necessary to evaluate the skb to be true here.
> >
> > Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe")
> > Signed-off-by: Lena Wang <lena.wang@mediatek.com>
> > ---
> 
> I am not sure why tcp_send_head() can return NULL.
> 
> Before tcp_can_coalesce_send_queue_head() is called, we have this
> code :
> 
> size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache;
> 
> /* Have enough data in the send queue to probe? */
> if (tp->write_seq - tp->snd_nxt < size_needed)
>     return -1;
> 
> 
> 
> Do you have a repro ?
Hi Eric,
It just happens once in monkey test. I can't reproduce it.

from the dump log, it can see these values:
(gdb) p tp->reordering
$16 = 4
(gdb) p tp->mss_cache
$17 = 128
probe_size = 755
size_needed = 755 + (4+1)*128 = 1395

(gdb) p tp->write_seq
$18 = 1571343838
(gdb) p tp->snd_nxt
$19 = 1571336917 
tp->write_seq - tp->snd_nxt = 1571343838 - 1571336917 = 6921 > 1395
Eric Dumazet Sept. 26, 2024, 2:21 p.m. UTC | #5
On Thu, Sep 26, 2024 at 4:05 PM Lena Wang (王娜) <Lena.Wang@mediatek.com> wrote:
>
> On Thu, 2024-09-26 at 11:07 +0200, Eric Dumazet wrote:
> >
> > External email : Please do not click links or open attachments until
> > you have verified the sender or the content.
> >  On Thu, Sep 26, 2024 at 9:55 AM Lena Wang <lena.wang@mediatek.com>
> > wrote:
> > >
> > > A kernel NULL pointer dereference reported.
> > > Backtrace:
> > > vmlinux tcp_can_coalesce_send_queue_head(sk=0xFFFFFF80316D9400,
> > len=755)
> > > + 28 </alps/OfficialRelease/Of/alps/kernel-
> > 6.6/net/ipv4/tcp_output.c:2315>
> > > vmlinux  tcp_mtu_probe(sk=0xFFFFFF80316D9400) + 3196
> > > </alps/OfficialRelease/Of/alps/kernel-
> > 6.6/net/ipv4/tcp_output.c:2452>
> > > vmlinux  tcp_write_xmit(sk=0xFFFFFF80316D9400, mss_now=128,
> > > nonagle=-2145862684, push_one=0, gfp=2080) + 3296
> > > </alps/OfficialRelease/Of/alps/kernel-
> > 6.6/net/ipv4/tcp_output.c:2689>
> > > vmlinux  tcp_tsq_write() + 172
> > > </alps/OfficialRelease/Of/alps/kernel-
> > 6.6/net/ipv4/tcp_output.c:1033>
> > > vmlinux  tcp_tsq_handler() + 104
> > > </alps/OfficialRelease/Of/alps/kernel-
> > 6.6/net/ipv4/tcp_output.c:1042>
> > > vmlinux  tcp_tasklet_func() + 208
> > >
> > > When there is no pending skb in sk->sk_write_queue, tcp_send_head
> > > returns NULL. Directly dereference of skb->len will result crash.
> > > So it is necessary to evaluate the skb to be true here.
> > >
> > > Fixes: 808cf9e38cd7 ("tcp: Honor the eor bit in tcp_mtu_probe")
> > > Signed-off-by: Lena Wang <lena.wang@mediatek.com>
> > > ---
> >
> > I am not sure why tcp_send_head() can return NULL.
> >
> > Before tcp_can_coalesce_send_queue_head() is called, we have this
> > code :
> >
> > size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache;
> >
> > /* Have enough data in the send queue to probe? */
> > if (tp->write_seq - tp->snd_nxt < size_needed)
> >     return -1;
> >
> >
> >
> > Do you have a repro ?
> Hi Eric,
> It just happens once in monkey test. I can't reproduce it.
>
> from the dump log, it can see these values:
> (gdb) p tp->reordering
> $16 = 4
> (gdb) p tp->mss_cache
> $17 = 128
> probe_size = 755
> size_needed = 755 + (4+1)*128 = 1395
>
> (gdb) p tp->write_seq
> $18 = 1571343838
> (gdb) p tp->snd_nxt
> $19 = 1571336917
> tp->write_seq - tp->snd_nxt = 1571343838 - 1571336917 = 6921 > 1395

OK thanks.

Next question is : with 6921 bytes in the write queue, how
tcp_send_head could possibly be NULL ?

This would hint at a more serious bug breaking a fundamental invariant.
diff mbox series

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 4fd746bd4d54..12cde5d879c5 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2338,17 +2338,19 @@  static bool tcp_can_coalesce_send_queue_head(struct sock *sk, int len)
 	struct sk_buff *skb, *next;
 
 	skb = tcp_send_head(sk);
-	tcp_for_write_queue_from_safe(skb, next, sk) {
-		if (len <= skb->len)
-			break;
+	if (skb) {
+		tcp_for_write_queue_from_safe(skb, next, sk) {
+			if (len <= skb->len)
+				break;
 
-		if (unlikely(TCP_SKB_CB(skb)->eor) ||
-		    tcp_has_tx_tstamp(skb) ||
-		    !skb_pure_zcopy_same(skb, next) ||
-		    skb_frags_readable(skb) != skb_frags_readable(next))
-			return false;
+			if (unlikely(TCP_SKB_CB(skb)->eor) ||
+			    tcp_has_tx_tstamp(skb) ||
+			    !skb_pure_zcopy_same(skb, next) ||
+			    skb_frags_readable(skb) != skb_frags_readable(next))
+				return false;
 
-		len -= skb->len;
+			len -= skb->len;
+		}
 	}
 
 	return true;