| Message ID | 20230601211732.1606062-1-edumazet@google.com (mailing list archive) |
|---|---|
| State | Superseded |
| Delegated to: | Netdev Maintainers |
| Series | [net] tcp: gso: really support BIG TCP |
> -----Original Message----- > From: Eric Dumazet <edumazet@google.com> > Sent: Thursday, June 1, 2023 2:18 PM > To: David S . Miller <davem@davemloft.net>; Jakub Kicinski > <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com> > Cc: netdev@vger.kernel.org; Xin Long <lucien.xin@gmail.com>; David Ahern > <dsahern@kernel.org>; eric.dumazet@gmail.com; Eric Dumazet > <edumazet@google.com>; Alexander Duyck <alexanderduyck@meta.com> > Subject: [PATCH net] tcp: gso: really support BIG TCP > > > > We missed that tcp_gso_segment() was assuming skb->len was smaller than > 65535 : > > oldlen = (u16)~skb->len; > > This part came with commit 0718bcc09b35 ("[NET]: Fix CHECKSUM_HW GSO > problems.") > > This leads to wrong TCP checksum. > > Simply use csum_fold() to support 32bit packet lengthes. > > oldlen name is a bit misleading, as it is the contribution of skb->len on the > input skb TCP checksum. I added a comment to clarify this point. > > Fixes: 09f3d1a3a52c ("ipv6/gso: remove temporary HBH/jumbo header") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Cc: Alexander Duyck <alexanderduyck@fb.com> > --- > net/ipv4/tcp_offload.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c index > 45dda788938704c3f762256266d9ea29b6ded4a5..5a1a163b2d859696df8f204b5 > 0e3fc76c14b64e9 100644 > --- a/net/ipv4/tcp_offload.c > +++ b/net/ipv4/tcp_offload.c > @@ -75,7 +75,8 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb, > if (!pskb_may_pull(skb, thlen)) > goto out; > > - oldlen = (u16)~skb->len; > + /* Contribution of skb->len in current TCP checksum */ > + oldlen = (__force u32)csum_fold((__force __wsum)skb->len); > __skb_pull(skb, thlen); > > mss = skb_shinfo(skb)->gso_size; The only thing I am not sure about with this change is if we risk overflowing a u32 with all the math that may occur. The original code couldn't exceed a u16 for delta since we were essentially adding -oldlen + new header + mss. 
With us now allowing the use of a value larger than 16 bit we should be able to have the resultant value exceed 16b which means we might overflow when we add it to the current checksum. As such we may want to look at holding off on the csum_fold until after we have added the new header and mss and then store the folded value in delta.
On Thu, Jun 1, 2023 at 11:46 PM Alexander Duyck <alexanderduyck@meta.com> wrote:
<...>
> The only thing I am not sure about with this change is whether we risk overflowing a u32 with all the math that may occur. The original code couldn't exceed a u16 for delta, since we were essentially adding -oldlen + new header + mss. With us now allowing a value larger than 16 bits, the resultant value could exceed 16 bits, which means we might overflow when we add it to the current checksum.
>
> As such we may want to look at holding off on the csum_fold until after we have added the new header and mss, and then store the folded value in delta.

I think you missed that the csum_fold() result is also a 16bit value. The logic Herbert added years ago will work the same.

Here is the pure C implementation of csum_fold(); I hope this will clear your concerns.

static inline __sum16 csum_fold(__wsum csum)
{
	u32 sum = (__force u32)csum;

	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (__force __sum16)~sum;
}

The TCP ipv6 stack populates th->check with tcp_v6_send_check()
  -> __tcp_v6_send_check(skb, &sk->sk_v6_rcv_saddr, &sk->sk_v6_daddr);
  -> th->check = ~tcp_v6_check(skb->len, saddr, daddr, 0);

Thanks !
On Fri, 2023-06-02 at 04:30 +0200, Eric Dumazet wrote:
<...>
> I think you missed that the csum_fold() result is also a 16bit value.

I saw that. My concern was more about delta versus the oldlen value itself, though. Specifically, your folded value is added to thlen + mss, which can then overflow past a 16-bit value, and when byteswapped and then added to the original checksum there is a risk of potential overflow.

The general idea was that ~skb->len + (segment length) will always technically be less than 0, since the original skb->len should always be larger than or equal to the new segmented length. So the code as it was would always generate a value 16 bits or fewer in length.

This was important when we computed delta and added it to the original value: since we were using htonl, which would byteswap things, we could potentially generate a 32-bit value, but it couldn't overflow since the two addends consisted of the upper 16 bits and lower 16 bits.

That is why I am thinking we are better off just dropping the "(u16)" cast and passing ~skb->len as the old_len.

To address this we then have a couple different approaches we could take:
1. use csum_fold on the "delta" value either before or after the htonl.
2. use csum_add instead of "+" for the addition of (th->check + delta).

I'm thinking option 2 may be the better way to go, as it would just add 2 addc operations, one for newcheck and one at the end of the function for th->check.
On Fri, Jun 2, 2023 at 4:24 PM Alexander H Duyck <alexander.duyck@gmail.com> wrote:
<...>
> I saw that. My concern was more about delta versus the oldlen value itself, though. Specifically, your folded value is added to thlen + mss, which can then overflow past a 16-bit value, and when byteswapped and then added to the original checksum there is a risk of potential overflow.

I do not think it matters. Herbert Xu said that what matters is that oldlen + (thlen + mss) would not overflow a 32bit value.

> To address this we then have a couple different approaches we could take:
> 1. use csum_fold on the "delta" value either before or after the htonl.
> 2. use csum_add instead of "+" for the addition of (th->check + delta).
>
> I'm thinking option 2 may be the better way to go, as it would just add 2 addc operations, one for newcheck and one at the end of the function for th->check.

Are you working on a patch yourself ? What would be the ETA ?

Thanks.
On Fri, Jun 2, 2023 at 5:21 PM Eric Dumazet <edumazet@google.com> wrote:
<...>
> Are you working on a patch yourself ? What would be the ETA ?

To be clear, I spent a lot of time trying to code something like the following, then I chose a much simpler solution (the patch as you see it).

This was the WIP (and not working at all):

diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 3568607588b107f06b8fb7b1143d5417bb2a3ac2..19bc2d9ae10d45aa5cbb35add4aa8e9f6b46a6ab 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -60,11 +60,11 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 {
 	struct sk_buff *segs = ERR_PTR(-EINVAL);
 	unsigned int sum_truesize = 0;
+	unsigned int oldlen, newlen;
 	struct tcphdr *th;
 	unsigned int thlen;
 	unsigned int seq;
-	__be32 delta;
-	unsigned int oldlen;
+	__wsum delta;
 	unsigned int mss;
 	struct sk_buff *gso_skb = skb;
 	__sum16 newcheck;
@@ -78,7 +78,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 	if (!pskb_may_pull(skb, thlen))
 		goto out;
 
-	oldlen = (u16)~skb->len;
+	oldlen = skb->len;
 	__skb_pull(skb, thlen);
 
 	mss = skb_shinfo(skb)->gso_size;
@@ -113,7 +113,9 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 	if (skb_is_gso(segs))
 		mss *= skb_shinfo(segs)->gso_segs;
 
-	delta = htonl(oldlen + (thlen + mss));
+	newlen = thlen + mss;
+	delta = csum_sub(htonl(newlen), htonl(oldlen));
+	newcheck = csum_fold(csum_add(csum_unfold(th->check), delta));
 
 	skb = segs;
 	th = tcp_hdr(skb);
@@ -122,8 +124,6 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 	if (unlikely(skb_shinfo(gso_skb)->tx_flags & SKBTX_SW_TSTAMP))
 		tcp_gso_tstamp(segs, skb_shinfo(gso_skb)->tskey, seq, mss);
 
-	newcheck = ~csum_fold((__force __wsum)((__force u32)th->check +
-					       (__force u32)delta));
 
 	while (skb->next) {
 		th->fin = th->psh = 0;
@@ -168,11 +168,11 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 		WARN_ON_ONCE(refcount_sub_and_test(-delta,
 						   &skb->sk->sk_wmem_alloc));
 	}
 
-	delta = htonl(oldlen + (skb_tail_pointer(skb) -
-				skb_transport_header(skb)) +
-		      skb->data_len);
-	th->check = ~csum_fold((__force __wsum)((__force u32)th->check +
-						(__force u32)delta));
+	newlen = skb_tail_pointer(skb) - skb_transport_header(skb) +
+		 skb->data_len;
+	delta = csum_sub(htonl(newlen), htonl(oldlen));
+	th->check = csum_fold(csum_add(csum_unfold(th->check), delta));
+
 	if (skb->ip_summed == CHECKSUM_PARTIAL)
 		gso_reset_checksum(skb, ~th->check);
 	else
On Fri, Jun 2, 2023 at 8:25 AM Eric Dumazet <edumazet@google.com> wrote:
<...>
> This was the WIP (and not working at all)

This is way more than what I was thinking. Some of what I had stated was just pseudocode. No need for the addition of newlen, for instance.

> @@ -60,11 +60,11 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
<...>
> -	__be32 delta;
> -	unsigned int oldlen;
> +	__wsum delta;

I wouldn't even bother with any of these changes.

> @@ -78,7 +78,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
<...>
> -	oldlen = (u16)~skb->len;
> +	oldlen = skb->len;

As I stated before, I would just drop the "(u16)" cast. We are expanding oldlen to a u32, but we don't have to worry about it overflowing because skb->len should always be greater than or equal to our segmented length. So the most it could reach is ~0 if skb->len == segmented length.

> @@ -113,7 +113,9 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
<...>
> -	delta = htonl(oldlen + (thlen + mss));
> +	newlen = thlen + mss;
> +	delta = csum_sub(htonl(newlen), htonl(oldlen));
> +	newcheck = csum_fold(csum_add(csum_unfold(th->check), delta));

I think all of this can just be dropped.

> @@ -122,8 +124,6 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
<...>
> -	newcheck = ~csum_fold((__force __wsum)((__force u32)th->check +
> -					       (__force u32)delta));

So the change I would make here is:

	newcheck = ~csum_fold(csum_add(csum_unfold(th->check),
				       (__force __wsum)delta));

> @@ -168,11 +168,11 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
<...>
> -	th->check = ~csum_fold((__force __wsum)((__force u32)th->check +
> -						(__force u32)delta));

Likewise here, the only thing that has to be replaced is the th->check line, so it would be:

	th->check = ~csum_fold(csum_add(csum_unfold(th->check),
					(__force __wsum)delta));
From: Alexander Duyck
> Sent: 02 June 2023 16:39
...
> As I stated before, I would just drop the "(u16)" cast. We are expanding
> oldlen to a u32, but we don't have to worry about it overflowing because
> skb->len should always be greater than or equal to our segmented length.
> So the most it could reach is ~0 if skb->len == segmented length.

Have you missed the ~ ? The (u16) makes it equivalent to skb->len ^ 0xffff. Otherwise all the high bits are set and any arithmetic will generate a carry and the wrong value.

FWIW this C version of csum_fold() (eg from arch/arc) generates better code than the existing versions on pretty much every architecture:

	return (~sum - ror32(sum, 16)) >> 16;

The main exception will be sparc (which has a carry flag but no rotate) and arm with its barrel shifter.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 45dda788938704c3f762256266d9ea29b6ded4a5..5a1a163b2d859696df8f204b50e3fc76c14b64e9 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -75,7 +75,8 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 	if (!pskb_may_pull(skb, thlen))
 		goto out;
 
-	oldlen = (u16)~skb->len;
+	/* Contribution of skb->len in current TCP checksum */
+	oldlen = (__force u32)csum_fold((__force __wsum)skb->len);
 	__skb_pull(skb, thlen);
 
 	mss = skb_shinfo(skb)->gso_size;
We missed that tcp_gso_segment() was assuming skb->len was smaller than 65535 :

	oldlen = (u16)~skb->len;

This part came with commit 0718bcc09b35 ("[NET]: Fix CHECKSUM_HW GSO problems.")

This leads to a wrong TCP checksum.

Simply use csum_fold() to support 32bit packet lengths.

The oldlen name is a bit misleading, as it is the contribution of skb->len to the input skb TCP checksum. I added a comment to clarify this point.

Fixes: 09f3d1a3a52c ("ipv6/gso: remove temporary HBH/jumbo header")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexanderduyck@fb.com>
---
 net/ipv4/tcp_offload.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)