diff mbox series

[net,v2,3/4] udp: do not transition UDP fraglist to unnecessary checksum

Message ID 20240319093140.499123-4-atenart@kernel.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series gro: various fixes related to UDP tunnels | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 949 this patch: 949
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 1 maintainers not CCed: dsahern@kernel.org
netdev/build_clang fail Errors and warnings before: 962 this patch: 962
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn fail Errors and warnings before: 966 this patch: 966
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 28 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Antoine Tenart March 19, 2024, 9:31 a.m. UTC
udp4/6_gro_complete transition fraglist packets to CHECKSUM_UNNECESSARY
and sets their checksum level based on if the packet is recognized to be
a tunneled one. However there is no safe way to detect a packet is a
tunneled one and in case such packet is GROed at the UDP level, setting
a wrong checksum level will lead to later errors. For example if those
packets are forwarded to the Tx path they could produce the following
dump:

  gen01: hw csum failure
  skb len=3008 headroom=160 headlen=1376 tailroom=0
  mac=(106,14) net=(120,40) trans=160
  shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
  csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
  hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
  ...

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 net/ipv4/udp_offload.c | 8 --------
 net/ipv6/udp_offload.c | 8 --------
 2 files changed, 16 deletions(-)

Comments

Willem de Bruijn March 19, 2024, 1:38 p.m. UTC | #1
Antoine Tenart wrote:
> udp4/6_gro_complete transition fraglist packets to CHECKSUM_UNNECESSARY
> and sets their checksum level based on if the packet is recognized to be
> a tunneled one. However there is no safe way to detect a packet is a
> tunneled one and in case such packet is GROed at the UDP level, setting
> a wrong checksum level will lead to later errors. For example if those
> packets are forwarded to the Tx path they could produce the following
> dump:
> 
>   gen01: hw csum failure
>   skb len=3008 headroom=160 headlen=1376 tailroom=0
>   mac=(106,14) net=(120,40) trans=160
>   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
>   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
>   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
>   ...
> 
> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> Signed-off-by: Antoine Tenart <atenart@kernel.org>

The original patch converted to CHECKSUM_UNNECESSARY for a reason.
The skb->csum of the main gso_skb is not valid?

Should instead only the csum_level be adjusted, to always keep
csum_level == 0?
Antoine Tenart March 19, 2024, 4:01 p.m. UTC | #2
Quoting Willem de Bruijn (2024-03-19 14:38:20)
> Antoine Tenart wrote:
> > udp4/6_gro_complete transition fraglist packets to CHECKSUM_UNNECESSARY
> > and sets their checksum level based on if the packet is recognized to be
> > a tunneled one. However there is no safe way to detect a packet is a
> > tunneled one and in case such packet is GROed at the UDP level, setting
> > a wrong checksum level will lead to later errors. For example if those
> > packets are forwarded to the Tx path they could produce the following
> > dump:
> > 
> >   gen01: hw csum failure
> >   skb len=3008 headroom=160 headlen=1376 tailroom=0
> >   mac=(106,14) net=(120,40) trans=160
> >   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> >   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
> >   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
> >   ...
> > 
> > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > Signed-off-by: Antoine Tenart <atenart@kernel.org>
> 
> The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> The skb->csum of the main gso_skb is not valid?
> 
> Should instead only the csum_level be adjusted, to always keep
> csum_level == 0?

The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
level, thus we have:
  UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
csum_level would need to be 1 here; but we can't know that.

There is another issue (no kernel trace): if a packet has partial csum
and is being GROed that information is lost and the packet ends up with
an invalid csum.

Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
impression is this checksum conversion is at best setting the same info
and otherwise is overriding valuable csum information.

Or would packets with CSUM_NONE being GROed would benefit from the
CHECKSUM_UNNECESSARY conversion?

For reference, original commit says:
"""
After validating the csum,  we mark ip_summed as
CHECKSUM_UNNECESSARY for fraglist GRO packets to
make sure that the csum is not touched.
"""

But I'm failing to see where that would happen and how the none to
unnecessary conversion would help. WDYT?

Thanks,
Antoine
Willem de Bruijn March 20, 2024, 1 p.m. UTC | #3
Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > Antoine Tenart wrote:
> > > udp4/6_gro_complete transition fraglist packets to CHECKSUM_UNNECESSARY
> > > and sets their checksum level based on if the packet is recognized to be
> > > a tunneled one. However there is no safe way to detect a packet is a
> > > tunneled one and in case such packet is GROed at the UDP level, setting
> > > a wrong checksum level will lead to later errors. For example if those
> > > packets are forwarded to the Tx path they could produce the following
> > > dump:
> > > 
> > >   gen01: hw csum failure
> > >   skb len=3008 headroom=160 headlen=1376 tailroom=0
> > >   mac=(106,14) net=(120,40) trans=160
> > >   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> > >   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
> > >   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
> > >   ...
> > > 
> > > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > > Signed-off-by: Antoine Tenart <atenart@kernel.org>
> > 
> > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > The skb->csum of the main gso_skb is not valid?
> > 
> > Should instead only the csum_level be adjusted, to always keep
> > csum_level == 0?
> 
> The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> level, thus we have:
>   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> csum_level would need to be 1 here; but we can't know that.

Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
Looped packets can trivially be converted to CHECKSUM_UNNECESSARY.
It just has to be level 0 if only the outer checksum is known.

> There is another issue (no kernel trace): if a packet has partial csum
> and is being GROed that information is lost and the packet ends up with
> an invalid csum.

CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
CHECKSUM_UNNECESSARY has neither expectations.
 
> Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> impression is this checksum conversion is at best setting the same info
> and otherwise is overriding valuable csum information.
> 
> Or would packets with CSUM_NONE being GROed would benefit from the
> CHECKSUM_UNNECESSARY conversion?

Definitely. If the packet has CHECKSUM_NONE and GRO checks its
validity in software, converting it to CHECKSUM_UNNECESSARY avoids
potential additional checks at later stages in the packet path.

> 
> For reference, original commit says:
> """
> After validating the csum,  we mark ip_summed as
> CHECKSUM_UNNECESSARY for fraglist GRO packets to
> make sure that the csum is not touched.
> """
> 
> But I'm failing to see where that would happen and how the none to
> unnecessary conversion would help. WDYT?

I would appreciate the GRO and fraglist GRO authors to also review
this series and chime in.

> 
> Thanks,
> Antoine
Antoine Tenart March 20, 2024, 3:08 p.m. UTC | #4
Quoting Willem de Bruijn (2024-03-20 14:00:48)
> Antoine Tenart wrote:
> > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > 
> > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > The skb->csum of the main gso_skb is not valid?
> > > 
> > > Should instead only the csum_level be adjusted, to always keep
> > > csum_level == 0?
> > 
> > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > level, thus we have:
> >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > csum_level would need to be 1 here; but we can't know that.
> 
> Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.

I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
packet can be looped internally or going to a remote host.

> > There is another issue (no kernel trace): if a packet has partial csum
> > and is being GROed that information is lost and the packet ends up with
> > an invalid csum.
> 
> CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> CHECKSUM_UNNECESSARY has neither expectations.

But not if the packet is sent to a remote host. Otherwise an inner
partial csum is never fixed by the stack/NIC before going out.

> > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > impression is this checksum conversion is at best setting the same info
> > and otherwise is overriding valuable csum information.
> > 
> > Or would packets with CSUM_NONE being GROed would benefit from the
> > CHECKSUM_UNNECESSARY conversion?
> 
> Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> potential additional checks at later stages in the packet path.

Makes sense. The current code really looks like
__skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
convert those packets.

Thanks!
Antoine
Willem de Bruijn March 20, 2024, 8:43 p.m. UTC | #5
Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-20 14:00:48)
> > Antoine Tenart wrote:
> > > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > > 
> > > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > > The skb->csum of the main gso_skb is not valid?
> > > > 
> > > > Should instead only the csum_level be adjusted, to always keep
> > > > csum_level == 0?
> > > 
> > > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > > level, thus we have:
> > >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > > csum_level would need to be 1 here; but we can't know that.
> > 
> > Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
> 
> I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
> encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
> packet can be looped internally or going to a remote host.

That is on transmit. To come into contact with UDP_GRO while having
CHECKSUM_PARTIAL the packet will have to loop into the receive path,
in some way that triggers GRO. Perhaps through gro_cells, as other
GRO paths are hardware NIC drivers.
 
> > > There is another issue (no kernel trace): if a packet has partial csum
> > > and is being GROed that information is lost and the packet ends up with
> > > an invalid csum.
> > 
> > CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> > reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> > header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> > CHECKSUM_UNNECESSARY has neither expectations.
> 
> But not if the packet is sent to a remote host. Otherwise an inner
> partial csum is never fixed by the stack/NIC before going out.

The stack will only offload a single checksum. With local checksum
offload, this can be the inner checksum and the outer can be cheaply
computed in software. udp_set_csum() handles this. It indeed sets lco
if the inner packet has CHECKSUM_PARTIAL. Otherwise it sets ip_summed
to CHECKSUM_PARTIAL, now pointing to the outer UDP header.

You're right. Regardless of whether it points to the inner or outer
checksum, a conversion of CHECKSUM_PARTIAL to CHECKSUM_UNNECESSARY
will break checksum offload in the forwarding case.

> > > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > > impression is this checksum conversion is at best setting the same info
> > > and otherwise is overriding valuable csum information.
> > > 
> > > Or would packets with CSUM_NONE being GROed would benefit from the
> > > CHECKSUM_UNNECESSARY conversion?
> > 
> > Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> > validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> > potential additional checks at later stages in the packet path.
> 
> Makes sense. The current code really looks like
> __skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
> convert those packets.
> 
> Thanks!
> Antoine
Antoine Tenart March 21, 2024, 8:48 a.m. UTC | #6
Quoting Willem de Bruijn (2024-03-20 21:43:55)
> Antoine Tenart wrote:
> > Quoting Willem de Bruijn (2024-03-20 14:00:48)
> > > Antoine Tenart wrote:
> > > > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > > > 
> > > > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > > > The skb->csum of the main gso_skb is not valid?
> > > > > 
> > > > > Should instead only the csum_level be adjusted, to always keep
> > > > > csum_level == 0?
> > > > 
> > > > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > > > level, thus we have:
> > > >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > > > csum_level would need to be 1 here; but we can't know that.
> > > 
> > > Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
> > 
> > I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
> > encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
> > packet can be looped internally or going to a remote host.
> 
> That is on transmit. To come into contact with UDP_GRO while having
> CHECKSUM_PARTIAL the packet will have to loop into the receive path,
> in some way that triggers GRO. Perhaps through gro_cells, as other
> GRO paths are hardware NIC drivers.

I get what you meant now, thanks. Yes, those Tx packets loop into the Rx
path. One easy way is through veth pairs, eg. packet get tunneled in a
netns, connected to another one via a veth pair.

> > > > There is another issue (no kernel trace): if a packet has partial csum
> > > > and is being GROed that information is lost and the packet ends up with
> > > > an invalid csum.
> > > 
> > > CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> > > reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> > > header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> > > CHECKSUM_UNNECESSARY has neither expectations.
> > 
> > But not if the packet is sent to a remote host. Otherwise an inner
> > partial csum is never fixed by the stack/NIC before going out.
> 
> The stack will only offload a single checksum. With local checksum
> offload, this can be the inner checksum and the outer can be cheaply
> computed in software. udp_set_csum() handles this. It indeed sets lco
> if the inner packet has CHECKSUM_PARTIAL. Otherwise it sets ip_summed
> to CHECKSUM_PARTIAL, now pointing to the outer UDP header.
> 
> You're right. Regardless of whether it points to the inner or outer
> checksum, a conversion of CHECKSUM_PARTIAL to CHECKSUM_UNNECESSARY
> will break checksum offload in the forwarding case.
> 
> > > > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > > > impression is this checksum conversion is at best setting the same info
> > > > and otherwise is overriding valuable csum information.
> > > > 
> > > > Or would packets with CSUM_NONE being GROed would benefit from the
> > > > CHECKSUM_UNNECESSARY conversion?
> > > 
> > > Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> > > validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> > > potential additional checks at later stages in the packet path.
> > 
> > Makes sense. The current code really looks like
> > __skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
> > convert those packets.

If I sum up our discussion CHECKSUM_NONE conversion is wanted,
CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
conversion breaks things. What about we just convert CHECKSUM_NONE to
CHECKSUM_UNNECESSARY?

diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 50a8a65fad23..44779d4c538b 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -174,7 +174,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
                if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
                        if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
                                skb->csum_level++;
-               } else {
+               } else if (skb->ip_summed == CHECKSUM_NONE) {
                        skb->ip_summed = CHECKSUM_UNNECESSARY;
                        skb->csum_level = 0;
                }

Or directly call __skb_incr_checksum_unnecessary.

Thanks,
Antoine
Paolo Abeni March 21, 2024, 12:42 p.m. UTC | #7
On Thu, 2024-03-21 at 09:48 +0100, Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-20 21:43:55)
> > Antoine Tenart wrote:
> > > Quoting Willem de Bruijn (2024-03-20 14:00:48)
> > > > Antoine Tenart wrote:
> > > > > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > > > > 
> > > > > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > > > > The skb->csum of the main gso_skb is not valid?
> > > > > > 
> > > > > > Should instead only the csum_level be adjusted, to always keep
> > > > > > csum_level == 0?
> > > > > 
> > > > > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > > > > level, thus we have:
> > > > >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > > > > csum_level would need to be 1 here; but we can't know that.
> > > > 
> > > > Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
> > > 
> > > I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
> > > encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
> > > packet can be looped internally or going to a remote host.
> > 
> > That is on transmit. To come into contact with UDP_GRO while having
> > CHECKSUM_PARTIAL the packet will have to loop into the receive path,
> > in some way that triggers GRO. Perhaps through gro_cells, as other
> > GRO paths are hardware NIC drivers.
> 
> I get what you meant now, thanks. Yes, those Tx packets loop into the Rx
> path. One easy way is through veth pairs, eg. packet get tunneled in a
> netns, connected to another one via a veth pair.
> 
> > > > > There is another issue (no kernel trace): if a packet has partial csum
> > > > > and is being GROed that information is lost and the packet ends up with
> > > > > an invalid csum.
> > > > 
> > > > CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> > > > reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> > > > header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> > > > CHECKSUM_UNNECESSARY has neither expectations.
> > > 
> > > But not if the packet is sent to a remote host. Otherwise an inner
> > > partial csum is never fixed by the stack/NIC before going out.
> > 
> > The stack will only offload a single checksum. With local checksum
> > offload, this can be the inner checksum and the outer can be cheaply
> > computed in software. udp_set_csum() handles this. It indeed sets lco
> > if the inner packet has CHECKSUM_PARTIAL. Otherwise it sets ip_summed
> > to CHECKSUM_PARTIAL, now pointing to the outer UDP header.
> > 
> > You're right. Regardless of whether it points to the inner or outer
> > checksum, a conversion of CHECKSUM_PARTIAL to CHECKSUM_UNNECESSARY
> > will break checksum offload in the forwarding case.
> > 
> > > > > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > > > > impression is this checksum conversion is at best setting the same info
> > > > > and otherwise is overriding valuable csum information.
> > > > > 
> > > > > Or would packets with CSUM_NONE being GROed would benefit from the
> > > > > CHECKSUM_UNNECESSARY conversion?
> > > > 
> > > > Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> > > > validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> > > > potential additional checks at later stages in the packet path.
> > > 
> > > Makes sense. The current code really looks like
> > > __skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
> > > convert those packets.
> 
> If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> conversion breaks things. What about we just convert CHECKSUM_NONE to
> CHECKSUM_UNNECESSARY?
> 
> diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> index 50a8a65fad23..44779d4c538b 100644
> --- a/net/ipv6/udp_offload.c
> +++ b/net/ipv6/udp_offload.c
> @@ -174,7 +174,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
>                 if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
>                         if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
>                                 skb->csum_level++;
> -               } else {
> +               } else if (skb->ip_summed == CHECKSUM_NONE) {
>                         skb->ip_summed = CHECKSUM_UNNECESSARY;
>                         skb->csum_level = 0;
>                 }
> 
> Or directly call __skb_incr_checksum_unnecessary

I think calling __skb_incr_checksum_unnecessary() would be the better
option.

@Willem: FTR, Antoine and me discussed this series internally for a
bit, and the above variant was also discussed. I was unable to find a
good reason for the CHECKSUM_NONE -> CHECKSUM_UNNECESSARY conversion
back then, so it seemed more clean to drop the whole chunk.

Cheers,

Paolo
Willem de Bruijn March 21, 2024, 2:58 p.m. UTC | #8
Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-20 21:43:55)
> > Antoine Tenart wrote:
> > > Quoting Willem de Bruijn (2024-03-20 14:00:48)
> > > > Antoine Tenart wrote:
> > > > > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > > > > 
> > > > > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > > > > The skb->csum of the main gso_skb is not valid?
> > > > > > 
> > > > > > Should instead only the csum_level be adjusted, to always keep
> > > > > > csum_level == 0?
> > > > > 
> > > > > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > > > > level, thus we have:
> > > > >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > > > > csum_level would need to be 1 here; but we can't know that.
> > > > 
> > > > Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
> > > 
> > > I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
> > > encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
> > > packet can be looped internally or going to a remote host.
> > 
> > That is on transmit. To come into contact with UDP_GRO while having
> > CHECKSUM_PARTIAL the packet will have to loop into the receive path,
> > in some way that triggers GRO. Perhaps through gro_cells, as other
> > GRO paths are hardware NIC drivers.
> 
> I get what you meant now, thanks. Yes, those Tx packets loop into the Rx
> path. One easy way is through veth pairs, eg. packet get tunneled in a
> netns, connected to another one via a veth pair.
> 
> > > > > There is another issue (no kernel trace): if a packet has partial csum
> > > > > and is being GROed that information is lost and the packet ends up with
> > > > > an invalid csum.
> > > > 
> > > > CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> > > > reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> > > > header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> > > > CHECKSUM_UNNECESSARY has neither expectations.
> > > 
> > > But not if the packet is sent to a remote host. Otherwise an inner
> > > partial csum is never fixed by the stack/NIC before going out.
> > 
> > The stack will only offload a single checksum. With local checksum
> > offload, this can be the inner checksum and the outer can be cheaply
> > computed in software. udp_set_csum() handles this. It indeed sets lco
> > if the inner packet has CHECKSUM_PARTIAL. Otherwise it sets ip_summed
> > to CHECKSUM_PARTIAL, now pointing to the outer UDP header.
> > 
> > You're right. Regardless of whether it points to the inner or outer
> > checksum, a conversion of CHECKSUM_PARTIAL to CHECKSUM_UNNECESSARY
> > will break checksum offload in the forwarding case.
> > 
> > > > > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > > > > impression is this checksum conversion is at best setting the same info
> > > > > and otherwise is overriding valuable csum information.
> > > > > 
> > > > > Or would packets with CSUM_NONE being GROed would benefit from the
> > > > > CHECKSUM_UNNECESSARY conversion?
> > > > 
> > > > Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> > > > validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> > > > potential additional checks at later stages in the packet path.
> > > 
> > > Makes sense. The current code really looks like
> > > __skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
> > > convert those packets.
> 
> If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> conversion breaks things. What about we just convert CHECKSUM_NONE to
> CHECKSUM_UNNECESSARY?

CHECKSUM_NONE cannot be converted to CHECKSUM_UNNECESSARY in the
receive path. Unless it is known to have been locally generated,
this means that the packet has not been verified yet.

> diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> index 50a8a65fad23..44779d4c538b 100644
> --- a/net/ipv6/udp_offload.c
> +++ b/net/ipv6/udp_offload.c
> @@ -174,7 +174,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
>                 if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
>                         if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
>                                 skb->csum_level++;
> -               } else {
> +               } else if (skb->ip_summed == CHECKSUM_NONE) {
>                         skb->ip_summed = CHECKSUM_UNNECESSARY;
>                         skb->csum_level = 0;
>                 }
> 
> Or directly call __skb_incr_checksum_unnecessary.
> 
> Thanks,
> Antoine
Antoine Tenart March 21, 2024, 5:22 p.m. UTC | #9
Quoting Willem de Bruijn (2024-03-21 15:58:17)
> Antoine Tenart wrote:
> > 
> > If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> > CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> > conversion breaks things. What about we just convert CHECKSUM_NONE to
> > CHECKSUM_UNNECESSARY?
> 
> CHECKSUM_NONE cannot be converted to CHECKSUM_UNNECESSARY in the
> receive path. Unless it is known to have been locally generated,
> this means that the packet has not been verified yet.

I'm not sure to follow, non-partial checksums are being verified by
skb_gro_checksum_validate_zero_check in udp4/6_gro_receive before ending
up in udp4/6_gro_complete. That's also probably what the original commit
msg refers to: "After validating the csum, we mark ip_summed as
CHECKSUM_UNNECESSARY for fraglist GRO packets".

With fraglist, the csum can then be converted to CHECKSUM_UNNECESSARY.
Except for CHECKSUM_PARTIAL, as we discussed.

Does that make sense? Anything we can do to help moving this forward?

Thanks!
Antoine
Willem de Bruijn March 21, 2024, 6:13 p.m. UTC | #10
Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-21 15:58:17)
> > Antoine Tenart wrote:
> > > 
> > > If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> > > CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> > > conversion breaks things. What about we just convert CHECKSUM_NONE to
> > > CHECKSUM_UNNECESSARY?
> > 
> > CHECKSUM_NONE cannot be converted to CHECKSUM_UNNECESSARY in the
> > receive path. Unless it is known to have been locally generated,
> > this means that the packet has not been verified yet.
> 
> I'm not sure to follow, non-partial checksums are being verified by
> skb_gro_checksum_validate_zero_check in udp4/6_gro_receive before ending
> up in udp4/6_gro_complete. That's also probably what the original commit
> msg refers to: "After validating the csum, we mark ip_summed as
> CHECKSUM_UNNECESSARY for fraglist GRO packets".
> 
> With fraglist, the csum can then be converted to CHECKSUM_UNNECESSARY.

Oh yes, of course.

> Except for CHECKSUM_PARTIAL, as we discussed.

Because that is treated as equivalent to CHECKSUM_UNNECESSARY in the
ingress path, and if forwarded to an egress path, csum_start and
csum_off are set correctly. Also for all segs after segmentation.
Okay, that sounds fine then.

There are two cases here: csum_start points to the outer header or it
points to the inner header. I suppose that does not matter for
correctness post segmentation.

> Does that make sense? Anything we can do to help moving this forward?
> 
> Thanks!
> Antoine
Antoine Tenart March 22, 2024, 10:48 a.m. UTC | #11
Quoting Willem de Bruijn (2024-03-21 19:13:35)
> Antoine Tenart wrote:
> > Quoting Willem de Bruijn (2024-03-21 15:58:17)
> > > Antoine Tenart wrote:
> > > > 
> > > > If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> > > > CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> > > > conversion breaks things. What about we just convert CHECKSUM_NONE to
> > > > CHECKSUM_UNNECESSARY?
> > > 
> > > CHECKSUM_NONE cannot be converted to CHECKSUM_UNNECESSARY in the
> > > receive path. Unless it is known to have been locally generated,
> > > this means that the packet has not been verified yet.
> > 
> > I'm not sure to follow, non-partial checksums are being verified by
> > skb_gro_checksum_validate_zero_check in udp4/6_gro_receive before ending
> > up in udp4/6_gro_complete. That's also probably what the original commit
> > msg refers to: "After validating the csum, we mark ip_summed as
> > CHECKSUM_UNNECESSARY for fraglist GRO packets".
> > 
> > With fraglist, the csum can then be converted to CHECKSUM_UNNECESSARY.
> 
> Oh yes, of course.
> 
> > Except for CHECKSUM_PARTIAL, as we discussed.
> 
> Because that is treated as equivalent to CHECKSUM_UNNECESSARY in the
> ingress path, and if forwarded to an egress path, csum_start and
> csum_off are set correctly. Also for all segs after segmentation.
> Okay, that sounds fine then.
> 
> There are two cases here: csum_start points to the outer header or it
> points to the inner header. I suppose that does not matter for
> correctness post segmentation.

Right. I'll send a v3 to keep the none -> unnecessary conversion and
fix build issue in patch 1.

Thanks for the review!
Antoine
diff mbox series

Patch

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 3bb69464930b..3263ebcaa3f4 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -722,14 +722,6 @@  INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
 		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
 		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
 
-		if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
-			if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
-				skb->csum_level++;
-		} else {
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			skb->csum_level = 0;
-		}
-
 		return 0;
 	}
 
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 312bcaeea96f..9289384cb7d0 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -174,14 +174,6 @@  INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
 		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
 		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
 
-		if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
-			if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
-				skb->csum_level++;
-		} else {
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			skb->csum_level = 0;
-		}
-
 		return 0;
 	}