diff mbox series

[net] gso: fix gso fraglist segmentation after pull from frag_list

Message ID 20240922150450.3873767-1-willemdebruijn.kernel@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] gso: fix gso fraglist segmentation after pull from frag_list | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 1 blamed authors not CCed: steffen.klassert@secunet.com; 7 maintainers not CCed: bpf@vger.kernel.org angelogioacchino.delregno@collabora.com linux-mediatek@lists.infradead.org matthias.bgg@gmail.com linux-arm-kernel@lists.infradead.org steffen.klassert@secunet.com dsahern@kernel.org
netdev/build_clang success Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 21 this patch: 21
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns WARNING: line length of 84 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-09-24--03-00 (tests: 762)

Commit Message

Willem de Bruijn Sept. 22, 2024, 3:03 p.m. UTC
From: Willem de Bruijn <willemb@google.com>

Detect gso fraglist skbs with corrupted geometry (see below) and
pass these to skb_segment instead of skb_segment_list, as the first
can segment them correctly.

Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size

Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
modify these skbs, breaking these invariants.

In extreme cases they pull all data into skb linear. For UDP, this
causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
udp_hdr(seg->next)->dest.

Detect invalid geometry due to pull, by checking head_skb size.
Don't just drop, as this may blackhole a destination. Convert to be
able to pass to regular skb_segment.

Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Cc: stable@vger.kernel.org

---

Tested:
- tools/testing/selftests/net/udpgro_fwd.sh
- kunit gso_test_func converted to calling __udp_gso_segment
- below manual end-to-end test:
  (which probably repeats a lot of udpgro_fwd.sh, in hindsight..)
  (won't repeat this on any resubmits, given how long it is)

  #!/bin/bash

  ip netns add test1
  ip netns add test2
  ip netns add test3

  ip link add dev veth0 netns test1 type veth peer name veth0 netns test2
  ip link add dev veth1 netns test2 type veth peer name veth1 netns test3

  ip netns exec test1 ip link set dev veth0 up
  ip netns exec test2 ip link set dev veth0 up

  ip netns exec test2 ip link set dev veth1 up
  ip netns exec test3 ip link set dev veth1 up

  ip netns exec test1 ip addr add 10.0.8.1/24 dev veth0
  ip netns exec test2 ip addr add 10.0.8.2/24 dev veth0

  ip netns exec test2 ip addr add 10.0.9.2/24 dev veth1
  ip netns exec test3 ip addr add 10.0.9.3/24 dev veth1

  ip -6 -netns test1 addr add fdaa::1 dev veth0
  ip -6 -netns test2 addr add fdaa::2 dev veth0

  ip -6 -netns test2 addr add fdbb::2 dev veth1
  ip -6 -netns test3 addr add fdbb::3 dev veth1

  ip netns exec test2 sysctl -w net.ipv4.ip_forward=1
  ip netns exec test2 sysctl -w net.ipv6.conf.all.forwarding=1

  ip -netns test1 route add default via 10.0.8.2
  ip -netns test3 route add default via 10.0.9.2

  ip -6 -netns test1 route add fdaa::2 dev veth0
  ip -6 -netns test2 route add fdaa::1 dev veth0
  ip -6 -netns test2 route add fdbb::3 dev veth1
  ip -6 -netns test3 route add fdbb::2 dev veth1

  ip -6 -netns test1 route add default via fdaa::2
  ip -6 -netns test3 route add default via fdbb::2

  ip netns exec test1 ethtool -K veth0 gso off tx-udp-segmentation off
  ip netns exec test2 ethtool -K veth0 gro on rx-gro-list on rx-udp-gro-forwarding on
  ip netns exec test2 ethtool -K veth1 gso off tx-udp-segmentation off

  ip netns exec test2 tc qdisc add dev veth0 clsact
  ip netns exec test2 tc filter add dev veth0 ingress bpf direct-action obj tc_pull.o sec tc

  ip netns exec test3 /mnt/shared/udpgso_bench_rx & \
  ip netns exec test1 /mnt/shared/udpgso_bench_tx -l 5 -4 -D 10.0.9.3 -s
  60000 -S 0 -z

  ip netns exec test3 /mnt/shared/udpgso_bench_rx & \
  ip netns exec test1 /mnt/shared/udpgso_bench_tx -l 5 -6 -D fdbb::3 -s
  60000 -S 0 -z

With trivial BPF program:

  $ cat ~/work/tc_pull.c
  // SPDX-License-Identifier: GPL-2.0

  #include <linux/bpf.h>
  #include <linux/pkt_cls.h>
  #include <linux/types.h>

  #include <bpf/bpf_helpers.h>

  __attribute__((section("tc")))

  int tc_cls_prog(struct __sk_buff *skb) {
  	bpf_skb_pull_data(skb, skb->len);
  	return TC_ACT_OK;
  }
---
 net/ipv4/udp_offload.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Felix Fietkau Sept. 24, 2024, 5:47 p.m. UTC | #1
On 22.09.24 17:03, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> Detect gso fraglist skbs with corrupted geometry (see below) and
> pass these to skb_segment instead of skb_segment_list, as the first
> can segment them correctly.
> 
> Valid SKB_GSO_FRAGLIST skbs
> - consist of two or more segments
> - the head_skb holds the protocol headers plus first gso_size
> - one or more frag_list skbs hold exactly one segment
> - all but the last must be gso_size
> 
> Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> modify these skbs, breaking these invariants.
> 
> In extreme cases they pull all data into skb linear. For UDP, this
> causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> udp_hdr(seg->next)->dest.
> 
> Detect invalid geometry due to pull, by checking head_skb size.
> Don't just drop, as this may blackhole a destination. Convert to be
> able to pass to regular skb_segment.
> 
> Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> Cc: stable@vger.kernel.org
> 
> ---
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index d842303587af..e457fa9143a6 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
>   		return NULL;
>   	}
>   
> -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
> -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
> +		 /* Detect modified geometry and pass these to skb_segment. */
> +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
> +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> +
> +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
> +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
> +		gso_skb->csum_offset = offsetof(struct udphdr, check);
> +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
> +	}

It seems to me that the TCP code would need something similar. Do you 
think the same approach would work there as well?

Thanks,

- Felix
Willem de Bruijn Sept. 24, 2024, 7:50 p.m. UTC | #2
Felix Fietkau wrote:
> On 22.09.24 17:03, Willem de Bruijn wrote:
> > From: Willem de Bruijn <willemb@google.com>
> > 
> > Detect gso fraglist skbs with corrupted geometry (see below) and
> > pass these to skb_segment instead of skb_segment_list, as the first
> > can segment them correctly.
> > 
> > Valid SKB_GSO_FRAGLIST skbs
> > - consist of two or more segments
> > - the head_skb holds the protocol headers plus first gso_size
> > - one or more frag_list skbs hold exactly one segment
> > - all but the last must be gso_size
> > 
> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> > modify these skbs, breaking these invariants.
> > 
> > In extreme cases they pull all data into skb linear. For UDP, this
> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> > udp_hdr(seg->next)->dest.
> > 
> > Detect invalid geometry due to pull, by checking head_skb size.
> > Don't just drop, as this may blackhole a destination. Convert to be
> > able to pass to regular skb_segment.
> > 
> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > Cc: stable@vger.kernel.org
> > 
> > ---
> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> > index d842303587af..e457fa9143a6 100644
> > --- a/net/ipv4/udp_offload.c
> > +++ b/net/ipv4/udp_offload.c
> > @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> >   		return NULL;
> >   	}
> >   
> > -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
> > -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> > +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
> > +		 /* Detect modified geometry and pass these to skb_segment. */
> > +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
> > +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> > +
> > +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
> > +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
> > +		gso_skb->csum_offset = offsetof(struct udphdr, check);
> > +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
> > +	}
> 
> It seems to me that the TCP code would need something similar.

I think you're right, thanks.

Separate patch, as different Fixes, of course.

> Do you think the same approach would work there as well?

tcp4_gro_complete seems to mirror udp4_gro_receive in returning early
before setting up checksum offload. So likely yes.

The script I shared to reproduce for UDP can hopefully be reused
easily to also generate these packets with TCP.
 
> Thanks,
> 
> - Felix
Felix Fietkau Sept. 25, 2024, 8:10 a.m. UTC | #3
On 22.09.24 17:03, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> Detect gso fraglist skbs with corrupted geometry (see below) and
> pass these to skb_segment instead of skb_segment_list, as the first
> can segment them correctly.
> 
> Valid SKB_GSO_FRAGLIST skbs
> - consist of two or more segments
> - the head_skb holds the protocol headers plus first gso_size
> - one or more frag_list skbs hold exactly one segment
> - all but the last must be gso_size
> 
> Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> modify these skbs, breaking these invariants.
> 
> In extreme cases they pull all data into skb linear. For UDP, this
> causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> udp_hdr(seg->next)->dest.
> 
> Detect invalid geometry due to pull, by checking head_skb size.
> Don't just drop, as this may blackhole a destination. Convert to be
> able to pass to regular skb_segment.
> 
> Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> Cc: stable@vger.kernel.org
> 
> ---
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index d842303587af..e457fa9143a6 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
>   		return NULL;
>   	}
>   
> -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
> -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
> +		 /* Detect modified geometry and pass these to skb_segment. */
> +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
> +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> +
> +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
> +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
> +		gso_skb->csum_offset = offsetof(struct udphdr, check);
> +		gso_skb->ip_summed = CHECKSUM_PARTIAL;

I also noticed this uh->check update done by udp4_gro_complete only in 
case of non-fraglist GRO:

     if (uh->check)
         uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
                       iph->daddr, 0);

I didn't see any equivalent in your patch. Is it missing or left out 
intentionally?

- Felix
Willem de Bruijn Sept. 25, 2024, 7:09 p.m. UTC | #4
Felix Fietkau wrote:
> On 22.09.24 17:03, Willem de Bruijn wrote:
> > From: Willem de Bruijn <willemb@google.com>
> > 
> > Detect gso fraglist skbs with corrupted geometry (see below) and
> > pass these to skb_segment instead of skb_segment_list, as the first
> > can segment them correctly.
> > 
> > Valid SKB_GSO_FRAGLIST skbs
> > - consist of two or more segments
> > - the head_skb holds the protocol headers plus first gso_size
> > - one or more frag_list skbs hold exactly one segment
> > - all but the last must be gso_size
> > 
> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> > modify these skbs, breaking these invariants.
> > 
> > In extreme cases they pull all data into skb linear. For UDP, this
> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> > udp_hdr(seg->next)->dest.
> > 
> > Detect invalid geometry due to pull, by checking head_skb size.
> > Don't just drop, as this may blackhole a destination. Convert to be
> > able to pass to regular skb_segment.
> > 
> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > Cc: stable@vger.kernel.org
> > 
> > ---
> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> > index d842303587af..e457fa9143a6 100644
> > --- a/net/ipv4/udp_offload.c
> > +++ b/net/ipv4/udp_offload.c
> > @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> >   		return NULL;
> >   	}
> >   
> > -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
> > -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> > +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
> > +		 /* Detect modified geometry and pass these to skb_segment. */
> > +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
> > +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> > +
> > +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
> > +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
> > +		gso_skb->csum_offset = offsetof(struct udphdr, check);
> > +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
> 
> I also noticed this uh->check update done by udp4_gro_complete only in 
> case of non-fraglist GRO:
> 
>      if (uh->check)
>          uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
>                        iph->daddr, 0);
> 
> I didn't see any equivalent in your patch. Is it missing or left out 
> intentionally?

Thanks. That was not intentional. I think you're right. Am a bit
concerned that all this testing did not catch it. Perhaps because
CHECKSUM_PARTIAL looped to ingress on the same machine is simply
interpreted as CHECKSUM_UNNECESSARY. Need to look into that.

If respinning this, I should also change the Fixes to

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")

Analogous to the eventual TCP fix to

Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
Felix Fietkau Sept. 25, 2024, 8:59 p.m. UTC | #5
On 25.09.24 21:09, Willem de Bruijn wrote:
> Felix Fietkau wrote:
>> On 22.09.24 17:03, Willem de Bruijn wrote:
>> > From: Willem de Bruijn <willemb@google.com>
>> > 
>> > Detect gso fraglist skbs with corrupted geometry (see below) and
>> > pass these to skb_segment instead of skb_segment_list, as the first
>> > can segment them correctly.
>> > 
>> > Valid SKB_GSO_FRAGLIST skbs
>> > - consist of two or more segments
>> > - the head_skb holds the protocol headers plus first gso_size
>> > - one or more frag_list skbs hold exactly one segment
>> > - all but the last must be gso_size
>> > 
>> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
>> > modify these skbs, breaking these invariants.
>> > 
>> > In extreme cases they pull all data into skb linear. For UDP, this
>> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
>> > udp_hdr(seg->next)->dest.
>> > 
>> > Detect invalid geometry due to pull, by checking head_skb size.
>> > Don't just drop, as this may blackhole a destination. Convert to be
>> > able to pass to regular skb_segment.
>> > 
>> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
>> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
>> > Signed-off-by: Willem de Bruijn <willemb@google.com>
>> > Cc: stable@vger.kernel.org
>> > 
>> > ---
>> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>> > index d842303587af..e457fa9143a6 100644
>> > --- a/net/ipv4/udp_offload.c
>> > +++ b/net/ipv4/udp_offload.c
>> > @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
>> >   		return NULL;
>> >   	}
>> >   
>> > -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
>> > -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
>> > +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
>> > +		 /* Detect modified geometry and pass these to skb_segment. */
>> > +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
>> > +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
>> > +
>> > +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
>> > +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
>> > +		gso_skb->csum_offset = offsetof(struct udphdr, check);
>> > +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
>> 
>> I also noticed this uh->check update done by udp4_gro_complete only in 
>> case of non-fraglist GRO:
>> 
>>      if (uh->check)
>>          uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
>>                        iph->daddr, 0);
>> 
>> I didn't see any equivalent in your patch. Is it missing or left out 
>> intentionally?
> 
> Thanks. That was not intentional. I think you're right. Am a bit
> concerned that all this testing did not catch it. Perhaps because
> CHECKSUM_PARTIAL looped to ingress on the same machine is simply
> interpreted as CHECKSUM_UNNECESSARY. Need to look into that.
> 
> If respinning this, I should also change the Fixes to
> 
> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> 
> Analogous to the eventual TCP fix to
> 
> Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")

In the mean time, I've been working on the TCP side. I managed to
reproduce the issue on one of my devices by routing traffic from
Ethernet to Wifi using your BPF test program.

The following patch makes it work for me for TCP v4. Still need to
test and fix v6.

- Felix

---
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -101,8 +101,20 @@ static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb,
  	if (!pskb_may_pull(skb, sizeof(struct tcphdr)))
  		return ERR_PTR(-EINVAL);
  
-	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST)
-		return __tcp4_gso_segment_list(skb, features);
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST) {
+		struct tcphdr *th = tcp_hdr(skb);
+		const struct iphdr *iph;
+
+		if (skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size)
+			return __tcp4_gso_segment_list(skb, features);
+
+		iph = ip_hdr(skb);
+		skb_shinfo(skb)->gso_type &= ~SKB_GSO_FRAGLIST;
+		skb->csum_start = (unsigned char *)th - skb->head;
+		skb->csum_offset = offsetof(struct tcphdr, check);
+		skb->ip_summed = CHECKSUM_PARTIAL;
+		th->check = ~tcp_v4_check(skb->len, iph->saddr, iph->daddr, 0);
+	}
  
  	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
  		const struct iphdr *iph = ip_hdr(skb);
Felix Fietkau Sept. 25, 2024, 9:12 p.m. UTC | #6
On 25.09.24 22:59, Felix Fietkau wrote:
> On 25.09.24 21:09, Willem de Bruijn wrote:
>> Felix Fietkau wrote:
>>> On 22.09.24 17:03, Willem de Bruijn wrote:
>>> > From: Willem de Bruijn <willemb@google.com>
>>> > 
>>> > Detect gso fraglist skbs with corrupted geometry (see below) and
>>> > pass these to skb_segment instead of skb_segment_list, as the first
>>> > can segment them correctly.
>>> > 
>>> > Valid SKB_GSO_FRAGLIST skbs
>>> > - consist of two or more segments
>>> > - the head_skb holds the protocol headers plus first gso_size
>>> > - one or more frag_list skbs hold exactly one segment
>>> > - all but the last must be gso_size
>>> > 
>>> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
>>> > modify these skbs, breaking these invariants.
>>> > 
>>> > In extreme cases they pull all data into skb linear. For UDP, this
>>> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
>>> > udp_hdr(seg->next)->dest.
>>> > 
>>> > Detect invalid geometry due to pull, by checking head_skb size.
>>> > Don't just drop, as this may blackhole a destination. Convert to be
>>> > able to pass to regular skb_segment.
>>> > 
>>> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
>>> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
>>> > Signed-off-by: Willem de Bruijn <willemb@google.com>
>>> > Cc: stable@vger.kernel.org
>>> > 
>>> > ---
>>> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>>> > index d842303587af..e457fa9143a6 100644
>>> > --- a/net/ipv4/udp_offload.c
>>> > +++ b/net/ipv4/udp_offload.c
>>> > @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
>>> >   		return NULL;
>>> >   	}
>>> >   
>>> > -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
>>> > -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
>>> > +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
>>> > +		 /* Detect modified geometry and pass these to skb_segment. */
>>> > +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
>>> > +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
>>> > +
>>> > +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
>>> > +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
>>> > +		gso_skb->csum_offset = offsetof(struct udphdr, check);
>>> > +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
>>> 
>>> I also noticed this uh->check update done by udp4_gro_complete only in 
>>> case of non-fraglist GRO:
>>> 
>>>      if (uh->check)
>>>          uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
>>>                        iph->daddr, 0);
>>> 
>>> I didn't see any equivalent in your patch. Is it missing or left out 
>>> intentionally?
>> 
>> Thanks. That was not intentional. I think you're right. Am a bit
>> concerned that all this testing did not catch it. Perhaps because
>> CHECKSUM_PARTIAL looped to ingress on the same machine is simply
>> interpreted as CHECKSUM_UNNECESSARY. Need to look into that.
>> 
>> If respinning this, I should also change the Fixes to
>> 
>> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
>> 
>> Analogous to the eventual TCP fix to
>> 
>> Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
> 
> In the mean time, I've been working on the TCP side. I managed to
> reproduce the issue on one of my devices by routing traffic from
> Ethernet to Wifi using your BPF test program.
> 
> The following patch makes it work for me for TCP v4. Still need to
> test and fix v6.

Actually, here is something even simpler that should work for both v4
and v6:

---
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -101,8 +101,14 @@ static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb,
  	if (!pskb_may_pull(skb, sizeof(struct tcphdr)))
  		return ERR_PTR(-EINVAL);
  
-	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST)
-		return __tcp4_gso_segment_list(skb, features);
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST) {
+		struct tcphdr *th = tcp_hdr(skb);
+
+		if (skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size)
+			return __tcp4_gso_segment_list(skb, features);
+
+		skb->ip_summed = CHECKSUM_NONE;
+	}
  
  	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
  		const struct iphdr *iph = ip_hdr(skb);
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -159,8 +159,14 @@ static struct sk_buff *tcp6_gso_segment(struct sk_buff *skb,
  	if (!pskb_may_pull(skb, sizeof(*th)))
  		return ERR_PTR(-EINVAL);
  
-	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST)
-		return __tcp6_gso_segment_list(skb, features);
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST) {
+		struct tcphdr *th = tcp_hdr(skb);
+
+		if (skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size)
+			return __tcp6_gso_segment_list(skb, features);
+
+		skb->ip_summed = CHECKSUM_NONE;
+	}
  
  	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
  		const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
Willem de Bruijn Sept. 26, 2024, 7:16 a.m. UTC | #7
Felix Fietkau wrote:
> On 25.09.24 21:09, Willem de Bruijn wrote:
> > Felix Fietkau wrote:
> >> On 22.09.24 17:03, Willem de Bruijn wrote:
> >> > From: Willem de Bruijn <willemb@google.com>
> >> > 
> >> > Detect gso fraglist skbs with corrupted geometry (see below) and
> >> > pass these to skb_segment instead of skb_segment_list, as the first
> >> > can segment them correctly.
> >> > 
> >> > Valid SKB_GSO_FRAGLIST skbs
> >> > - consist of two or more segments
> >> > - the head_skb holds the protocol headers plus first gso_size
> >> > - one or more frag_list skbs hold exactly one segment
> >> > - all but the last must be gso_size
> >> > 
> >> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> >> > modify these skbs, breaking these invariants.
> >> > 
> >> > In extreme cases they pull all data into skb linear. For UDP, this
> >> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> >> > udp_hdr(seg->next)->dest.
> >> > 
> >> > Detect invalid geometry due to pull, by checking head_skb size.
> >> > Don't just drop, as this may blackhole a destination. Convert to be
> >> > able to pass to regular skb_segment.
> >> > 
> >> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> >> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> >> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> >> > Cc: stable@vger.kernel.org
> >> > 
> >> > ---
> >> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> >> > index d842303587af..e457fa9143a6 100644
> >> > --- a/net/ipv4/udp_offload.c
> >> > +++ b/net/ipv4/udp_offload.c
> >> > @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> >> >   		return NULL;
> >> >   	}
> >> >   
> >> > -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
> >> > -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> >> > +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
> >> > +		 /* Detect modified geometry and pass these to skb_segment. */
> >> > +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
> >> > +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> >> > +
> >> > +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
> >> > +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
> >> > +		gso_skb->csum_offset = offsetof(struct udphdr, check);
> >> > +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
> >> 
> >> I also noticed this uh->check update done by udp4_gro_complete only in 
> >> case of non-fraglist GRO:
> >> 
> >>      if (uh->check)
> >>          uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
> >>                        iph->daddr, 0);
> >> 
> >> I didn't see any equivalent in your patch. Is it missing or left out 
> >> intentionally?
> > 
> > Thanks. That was not intentional. I think you're right. Am a bit
> > concerned that all this testing did not catch it. Perhaps because
> > CHECKSUM_PARTIAL looped to ingress on the same machine is simply
> > interpreted as CHECKSUM_UNNECESSARY. Need to look into that.
> > 
> > If respinning this, I should also change the Fixes to
> > 
> > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > 
> > Analogous to the eventual TCP fix to
> > 
> > Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
> 
> In the mean time, I've been working on the TCP side. I managed to
> reproduce the issue on one of my devices by routing traffic from
> Ethernet to Wifi using your BPF test program.
> 
> The following patch makes it work for me for TCP v4. Still need to
> test and fix v6.
> 
> - Felix
> 
> ---
> --- a/net/ipv4/tcp_offload.c
> +++ b/net/ipv4/tcp_offload.c
> @@ -101,8 +101,20 @@ static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb,
>   	if (!pskb_may_pull(skb, sizeof(struct tcphdr)))
>   		return ERR_PTR(-EINVAL);
>   
> -	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST)
> -		return __tcp4_gso_segment_list(skb, features);
> +	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST) {
> +		struct tcphdr *th = tcp_hdr(skb);
> +		const struct iphdr *iph;
> +
> +		if (skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size)
> +			return __tcp4_gso_segment_list(skb, features);
> +
> +		iph = ip_hdr(skb);
> +		skb_shinfo(skb)->gso_type &= ~SKB_GSO_FRAGLIST;

I left this out, because skb_segment does not care about this bit.

> +		skb->csum_start = (unsigned char *)th - skb->head;
> +		skb->csum_offset = offsetof(struct tcphdr, check);
> +		skb->ip_summed = CHECKSUM_PARTIAL;
> +		th->check = ~tcp_v4_check(skb->len, iph->saddr, iph->daddr, 0);
> +	}
>   
>   	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
>   		const struct iphdr *iph = ip_hdr(skb);
>
Willem de Bruijn Sept. 26, 2024, 7:19 a.m. UTC | #8
Felix Fietkau wrote:
> On 25.09.24 22:59, Felix Fietkau wrote:
> > On 25.09.24 21:09, Willem de Bruijn wrote:
> >> Felix Fietkau wrote:
> >>> On 22.09.24 17:03, Willem de Bruijn wrote:
> >>> > From: Willem de Bruijn <willemb@google.com>
> >>> > 
> >>> > Detect gso fraglist skbs with corrupted geometry (see below) and
> >>> > pass these to skb_segment instead of skb_segment_list, as the first
> >>> > can segment them correctly.
> >>> > 
> >>> > Valid SKB_GSO_FRAGLIST skbs
> >>> > - consist of two or more segments
> >>> > - the head_skb holds the protocol headers plus first gso_size
> >>> > - one or more frag_list skbs hold exactly one segment
> >>> > - all but the last must be gso_size
> >>> > 
> >>> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> >>> > modify these skbs, breaking these invariants.
> >>> > 
> >>> > In extreme cases they pull all data into skb linear. For UDP, this
> >>> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> >>> > udp_hdr(seg->next)->dest.
> >>> > 
> >>> > Detect invalid geometry due to pull, by checking head_skb size.
> >>> > Don't just drop, as this may blackhole a destination. Convert to be
> >>> > able to pass to regular skb_segment.
> >>> > 
> >>> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> >>> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> >>> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> >>> > Cc: stable@vger.kernel.org
> >>> > 
> >>> > ---
> >>> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> >>> > index d842303587af..e457fa9143a6 100644
> >>> > --- a/net/ipv4/udp_offload.c
> >>> > +++ b/net/ipv4/udp_offload.c
> >>> > @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> >>> >   		return NULL;
> >>> >   	}
> >>> >   
> >>> > -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
> >>> > -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> >>> > +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
> >>> > +		 /* Detect modified geometry and pass these to skb_segment. */
> >>> > +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
> >>> > +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> >>> > +
> >>> > +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
> >>> > +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
> >>> > +		gso_skb->csum_offset = offsetof(struct udphdr, check);
> >>> > +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
> >>> 
> >>> I also noticed this uh->check update done by udp4_gro_complete only in 
> >>> case of non-fraglist GRO:
> >>> 
> >>>      if (uh->check)
> >>>          uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
> >>>                        iph->daddr, 0);
> >>> 
> >>> I didn't see any equivalent in your patch. Is it missing or left out 
> >>> intentionally?
> >> 
> >> Thanks. That was not intentional. I think you're right. Am a bit
> >> concerned that all this testing did not catch it. Perhaps because
> >> CHECKSUM_PARTIAL looped to ingress on the same machine is simply
> >> interpreted as CHECKSUM_UNNECESSARY. Need to look into that.
> >> 
> >> If respinning this, I should also change the Fixes to
> >> 
> >> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> >> 
> >> Analogous to the eventual TCP fix to
> >> 
> >> Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
> > 
> > In the mean time, I've been working on the TCP side. I managed to
> > reproduce the issue on one of my devices by routing traffic from
> > Ethernet to Wifi using your BPF test program.
> > 
> > The following patch makes it work for me for TCP v4. Still need to
> > test and fix v6.
> 
> Actually, here is something even simpler that should work for both v4
> and v6:

Makes sense. It does come with higher cost of calling skb_checksum.
 
> ---
> --- a/net/ipv4/tcp_offload.c
> +++ b/net/ipv4/tcp_offload.c
> @@ -101,8 +101,14 @@ static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb,
>   	if (!pskb_may_pull(skb, sizeof(struct tcphdr)))
>   		return ERR_PTR(-EINVAL);
>   
> -	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST)
> -		return __tcp4_gso_segment_list(skb, features);
> +	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST) {
> +		struct tcphdr *th = tcp_hdr(skb);
> +
> +		if (skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size)
> +			return __tcp4_gso_segment_list(skb, features);
> +
> +		skb->ip_summed = CHECKSUM_NONE;
> +	}
>   
>   	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
>   		const struct iphdr *iph = ip_hdr(skb);
> --- a/net/ipv6/tcpv6_offload.c
> +++ b/net/ipv6/tcpv6_offload.c
> @@ -159,8 +159,14 @@ static struct sk_buff *tcp6_gso_segment(struct sk_buff *skb,
>   	if (!pskb_may_pull(skb, sizeof(*th)))
>   		return ERR_PTR(-EINVAL);
>   
> -	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST)
> -		return __tcp6_gso_segment_list(skb, features);
> +	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST) {
> +		struct tcphdr *th = tcp_hdr(skb);
> +
> +		if (skb_pagelen(skb) - th->doff * 4 == skb_shinfo(skb)->gso_size)
> +			return __tcp6_gso_segment_list(skb, features);
> +
> +		skb->ip_summed = CHECKSUM_NONE;
> +	}
>   
>   	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
>   		const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
> 
>
Felix Fietkau Sept. 26, 2024, 7:38 a.m. UTC | #9
On 26.09.24 09:19, Willem de Bruijn wrote:
> Felix Fietkau wrote:
>> On 25.09.24 22:59, Felix Fietkau wrote:
>> > On 25.09.24 21:09, Willem de Bruijn wrote:
>> >> Felix Fietkau wrote:
>> >>> On 22.09.24 17:03, Willem de Bruijn wrote:
>> >>> > From: Willem de Bruijn <willemb@google.com>
>> >>> > 
>> >>> > Detect gso fraglist skbs with corrupted geometry (see below) and
>> >>> > pass these to skb_segment instead of skb_segment_list, as the first
>> >>> > can segment them correctly.
>> >>> > 
>> >>> > Valid SKB_GSO_FRAGLIST skbs
>> >>> > - consist of two or more segments
>> >>> > - the head_skb holds the protocol headers plus first gso_size
>> >>> > - one or more frag_list skbs hold exactly one segment
>> >>> > - all but the last must be gso_size
>> >>> > 
>> >>> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
>> >>> > modify these skbs, breaking these invariants.
>> >>> > 
>> >>> > In extreme cases they pull all data into skb linear. For UDP, this
>> >>> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
>> >>> > udp_hdr(seg->next)->dest.
>> >>> > 
>> >>> > Detect invalid geometry due to pull, by checking head_skb size.
>> >>> > Don't just drop, as this may blackhole a destination. Convert to be
>> >>> > able to pass to regular skb_segment.
>> >>> > 
>> >>> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
>> >>> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
>> >>> > Signed-off-by: Willem de Bruijn <willemb@google.com>
>> >>> > Cc: stable@vger.kernel.org
>> >>> > 
>> >>> > ---
>> >>> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>> >>> > index d842303587af..e457fa9143a6 100644
>> >>> > --- a/net/ipv4/udp_offload.c
>> >>> > +++ b/net/ipv4/udp_offload.c
>> >>> > @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
>> >>> >   		return NULL;
>> >>> >   	}
>> >>> >   
>> >>> > -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
>> >>> > -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
>> >>> > +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
>> >>> > +		 /* Detect modified geometry and pass these to skb_segment. */
>> >>> > +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
>> >>> > +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
>> >>> > +
>> >>> > +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
>> >>> > +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
>> >>> > +		gso_skb->csum_offset = offsetof(struct udphdr, check);
>> >>> > +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
>> >>> 
>> >>> I also noticed this uh->check update done by udp4_gro_complete only in 
>> >>> case of non-fraglist GRO:
>> >>> 
>> >>>      if (uh->check)
>> >>>          uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
>> >>>                        iph->daddr, 0);
>> >>> 
>> >>> I didn't see any equivalent in your patch. Is it missing or left out 
>> >>> intentionally?
>> >> 
>> >> Thanks. That was not intentional. I think you're right. Am a bit
>> >> concerned that all this testing did not catch it. Perhaps because
>> >> CHECKSUM_PARTIAL looped to ingress on the same machine is simply
>> >> interpreted as CHECKSUM_UNNECESSARY. Need to look into that.
>> >> 
>> >> If respinning this, I should also change the Fixes to
>> >> 
>> >> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
>> >> 
>> >> Analogous to the eventual TCP fix to
>> >> 
>> >> Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
>> > 
>> > In the mean time, I've been working on the TCP side. I managed to
>> > reproduce the issue on one of my devices by routing traffic from
>> > Ethernet to Wifi using your BPF test program.
>> > 
>> > The following patch makes it work for me for TCP v4. Still need to
>> > test and fix v6.
>> 
>> Actually, here is something even simpler that should work for both v4
>> and v6:
> 
> Makes sense. It does come with higher cost of calling skb_checksum.

But only if there is no checksum offload, right? Because the way I 
implemented it, the lines further below starting with
if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL))
will initialize th->check and set ip_summed to CHECKSUM_PARTIAL
Or am I missing something?

Thanks,

- Felix
Willem de Bruijn Sept. 26, 2024, 7:45 a.m. UTC | #10
Felix Fietkau wrote:
> On 26.09.24 09:19, Willem de Bruijn wrote:
> > Felix Fietkau wrote:
> >> On 25.09.24 22:59, Felix Fietkau wrote:
> >> > On 25.09.24 21:09, Willem de Bruijn wrote:
> >> >> Felix Fietkau wrote:
> >> >>> On 22.09.24 17:03, Willem de Bruijn wrote:
> >> >>> > From: Willem de Bruijn <willemb@google.com>
> >> >>> > 
> >> >>> > Detect gso fraglist skbs with corrupted geometry (see below) and
> >> >>> > pass these to skb_segment instead of skb_segment_list, as the first
> >> >>> > can segment them correctly.
> >> >>> > 
> >> >>> > Valid SKB_GSO_FRAGLIST skbs
> >> >>> > - consist of two or more segments
> >> >>> > - the head_skb holds the protocol headers plus first gso_size
> >> >>> > - one or more frag_list skbs hold exactly one segment
> >> >>> > - all but the last must be gso_size
> >> >>> > 
> >> >>> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> >> >>> > modify these skbs, breaking these invariants.
> >> >>> > 
> >> >>> > In extreme cases they pull all data into skb linear. For UDP, this
> >> >>> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> >> >>> > udp_hdr(seg->next)->dest.
> >> >>> > 
> >> >>> > Detect invalid geometry due to pull, by checking head_skb size.
> >> >>> > Don't just drop, as this may blackhole a destination. Convert to be
> >> >>> > able to pass to regular skb_segment.
> >> >>> > 
> >> >>> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> >> >>> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> >> >>> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> >> >>> > Cc: stable@vger.kernel.org
> >> >>> > 
> >> >>> > ---
> >> >>> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> >> >>> > index d842303587af..e457fa9143a6 100644
> >> >>> > --- a/net/ipv4/udp_offload.c
> >> >>> > +++ b/net/ipv4/udp_offload.c
> >> >>> > @@ -296,8 +296,16 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> >> >>> >   		return NULL;
> >> >>> >   	}
> >> >>> >   
> >> >>> > -	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
> >> >>> > -		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> >> >>> > +	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
> >> >>> > +		 /* Detect modified geometry and pass these to skb_segment. */
> >> >>> > +		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
> >> >>> > +			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
> >> >>> > +
> >> >>> > +		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
> >> >>> > +		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
> >> >>> > +		gso_skb->csum_offset = offsetof(struct udphdr, check);
> >> >>> > +		gso_skb->ip_summed = CHECKSUM_PARTIAL;
> >> >>> 
> >> >>> I also noticed this uh->check update done by udp4_gro_complete only in 
> >> >>> case of non-fraglist GRO:
> >> >>> 
> >> >>>      if (uh->check)
> >> >>>          uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
> >> >>>                        iph->daddr, 0);
> >> >>> 
> >> >>> I didn't see any equivalent in your patch. Is it missing or left out 
> >> >>> intentionally?
> >> >> 
> >> >> Thanks. That was not intentional. I think you're right. Am a bit
> >> >> concerned that all this testing did not catch it. Perhaps because
> >> >> CHECKSUM_PARTIAL looped to ingress on the same machine is simply
> >> >> interpreted as CHECKSUM_UNNECESSARY. Need to look into that.
> >> >> 
> >> >> If respinning this, I should also change the Fixes to
> >> >> 
> >> >> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> >> >> 
> >> >> Analogous to the eventual TCP fix to
> >> >> 
> >> >> Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
> >> > 
> >> > In the mean time, I've been working on the TCP side. I managed to
> >> > reproduce the issue on one of my devices by routing traffic from
> >> > Ethernet to Wifi using your BPF test program.
> >> > 
> >> > The following patch makes it work for me for TCP v4. Still need to
> >> > test and fix v6.
> >> 
> >> Actually, here is something even simpler that should work for both v4
> >> and v6:
> > 
> > Makes sense. It does come with higher cost of calling skb_checksum.
> 
> But only if there is no checksum offload, right? Because the way I 
> implemented it, the lines further below starting with
> if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL))
> will initialize th->check and set ip_summed to CHECKSUM_PARTIAL
> Or am I missing something?

Oh you're right, __tcp_v4_send_check then does this, so no need to
open code it.
Felix Fietkau Sept. 26, 2024, 9:17 a.m. UTC | #11
On 22.09.24 17:03, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> Detect gso fraglist skbs with corrupted geometry (see below) and
> pass these to skb_segment instead of skb_segment_list, as the first
> can segment them correctly.
> 
> Valid SKB_GSO_FRAGLIST skbs
> - consist of two or more segments
> - the head_skb holds the protocol headers plus first gso_size
> - one or more frag_list skbs hold exactly one segment
> - all but the last must be gso_size
> 
> Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> modify these skbs, breaking these invariants.
> 
> In extreme cases they pull all data into skb linear. For UDP, this
> causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> udp_hdr(seg->next)->dest.
> 
> Detect invalid geometry due to pull, by checking head_skb size.
> Don't just drop, as this may blackhole a destination. Convert to be
> able to pass to regular skb_segment.
> 
> Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Felix Fietkau <nbd@nbd.name>
Willem de Bruijn Sept. 26, 2024, 9:22 a.m. UTC | #12
Felix Fietkau wrote:
> On 22.09.24 17:03, Willem de Bruijn wrote:
> > From: Willem de Bruijn <willemb@google.com>
> > 
> > Detect gso fraglist skbs with corrupted geometry (see below) and
> > pass these to skb_segment instead of skb_segment_list, as the first
> > can segment them correctly.
> > 
> > Valid SKB_GSO_FRAGLIST skbs
> > - consist of two or more segments
> > - the head_skb holds the protocol headers plus first gso_size
> > - one or more frag_list skbs hold exactly one segment
> > - all but the last must be gso_size
> > 
> > Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
> > modify these skbs, breaking these invariants.
> > 
> > In extreme cases they pull all data into skb linear. For UDP, this
> > causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
> > udp_hdr(seg->next)->dest.
> > 
> > Detect invalid geometry due to pull, by checking head_skb size.
> > Don't just drop, as this may blackhole a destination. Convert to be
> > able to pass to regular skb_segment.
> > 
> > Link: https://lore.kernel.org/netdev/20240428142913.18666-1-shiming.cheng@mediatek.com/
> > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > Cc: stable@vger.kernel.org
> 
> Reviewed-by: Felix Fietkau <nbd@nbd.name>

I will respin with initialization of uh->check
diff mbox series

Patch

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index d842303587af..e457fa9143a6 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -296,8 +296,16 @@  struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 		return NULL;
 	}
 
-	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
-		return __udp_gso_segment_list(gso_skb, features, is_ipv6);
+	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) {
+		 /* Detect modified geometry and pass these to skb_segment. */
+		if (skb_pagelen(gso_skb) - sizeof(*uh) == skb_shinfo(gso_skb)->gso_size)
+			return __udp_gso_segment_list(gso_skb, features, is_ipv6);
+
+		 /* Setup csum, as fraglist skips this in udp4_gro_receive. */
+		gso_skb->csum_start = skb_transport_header(gso_skb) - gso_skb->head;
+		gso_skb->csum_offset = offsetof(struct udphdr, check);
+		gso_skb->ip_summed = CHECKSUM_PARTIAL;
+	}
 
 	skb_pull(gso_skb, sizeof(*uh));