[3/3] net: xilinx: axienet: Relax partial rx checksum checks

Message ID 20240903184334.4150843-4-sean.anderson@linux.dev (mailing list archive)
State New, archived
Series net: xilinx: axienet: Partial checksum offload improvements

Commit Message

Sean Anderson Sept. 3, 2024, 6:43 p.m. UTC
The partial rx checksum feature computes a checksum over the entire
packet, regardless of the L3 protocol. Remove the check for IPv4.
Additionally, packets under 64 bytes should have been dropped by the
MAC, so we can remove the length check as well.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---

 drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Simon Horman Sept. 4, 2024, 4:20 p.m. UTC | #1
On Tue, Sep 03, 2024 at 02:43:34PM -0400, Sean Anderson wrote:
> The partial rx checksum feature computes a checksum over the entire
> packet, regardless of the L3 protocol. Remove the check for IPv4.
> Additionally, packets under 64 bytes should have been dropped by the
> MAC, so we can remove the length check as well.
> 
> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>

Reviewed-by: Simon Horman <horms@kernel.org>
Eric Dumazet Sept. 4, 2024, 4:30 p.m. UTC | #2
On Tue, Sep 3, 2024 at 8:43 PM Sean Anderson <sean.anderson@linux.dev> wrote:
>
> The partial rx checksum feature computes a checksum over the entire
> packet, regardless of the L3 protocol. Remove the check for IPv4.
> Additionally, packets under 64 bytes should have been dropped by the
> MAC, so we can remove the length check as well.

Some packets have a smaller len (than 64).

For instance, TCP pure ACK and no options over IPv4 would be 54 bytes long.

Presumably they are not dropped by the MAC ?
Sean Anderson Sept. 5, 2024, 2:24 p.m. UTC | #3
On 9/4/24 12:30, Eric Dumazet wrote:
> On Tue, Sep 3, 2024 at 8:43 PM Sean Anderson <sean.anderson@linux.dev> wrote:
>>
>> The partial rx checksum feature computes a checksum over the entire
>> packet, regardless of the L3 protocol. Remove the check for IPv4.
>> Additionally, packets under 64 bytes should have been dropped by the
>> MAC, so we can remove the length check as well.
> 
> Some packets have a smaller len (than 64).
> 
> For instance, TCP pure ACK and no options over IPv4 would be 54 bytes long.
> 
> Presumably they are not dropped by the MAC ?

Ethernet frames have a minimum size on the wire of 64 bytes. From 802.3
section 4.2.4.2.2:

| The shortest valid transmission in full duplex mode must be at least
| minFrameSize in length. While collisions do not occur in full duplex
| mode MACs, a full duplex MAC nevertheless discards received frames
| containing less than minFrameSize bits. The discarding of such a frame
| by a MAC is not reported as an error.

where minFrameSize is 512 bits (64 bytes).

On the transmit side, undersize frames are padded. From 802.3 section
4.2.3.3:

| The CSMA/CD Media Access mechanism requires that a minimum frame
| length of minFrameSize bits be transmitted. If frameSize is less than
| minFrameSize, then the CSMA/CD MAC sublayer shall append extra bits in
| units of octets (Pad), after the end of the MAC Client Data field but
| prior to calculating and appending the FCS (if not provided by the MAC
| client).
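
To make the numbers concrete, here is a rough sketch of the arithmetic for
the pure-ACK case raised above. The constants mirror the kernel's ETH_HLEN,
ETH_ZLEN and ETH_FCS_LEN; the helper name is made up for illustration, and
the header sizes assume IPv4 and TCP without options.

```python
ETH_HLEN = 14      # destination MAC + source MAC + EtherType
ETH_ZLEN = 60      # minimum frame length, not counting the FCS
ETH_FCS_LEN = 4    # frame check sequence
MIN_FRAME_SIZE = ETH_ZLEN + ETH_FCS_LEN  # 64 bytes = 512 bits (minFrameSize)

IPV4_HLEN = 20     # IPv4 header without options (assumption)
TCP_HLEN = 20      # TCP header without options (assumption)


def pad_bytes_needed(frame_len_without_fcs: int) -> int:
    """Hypothetical helper: pad octets the transmitting MAC has to append."""
    return max(0, ETH_ZLEN - frame_len_without_fcs)


pure_ack = ETH_HLEN + IPV4_HLEN + TCP_HLEN   # frame as built by the sender's stack
pad = pad_bytes_needed(pure_ack)             # octets appended before the FCS
on_wire = pure_ack + pad + ETH_FCS_LEN       # what the receiving MAC sees

print(f"frame={pure_ack} pad={pad} on_wire={on_wire}")
```

So the 54-byte figure is what the sender's stack hands down, while the
receiving MAC sees a minFrameSize frame that includes the pad.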

That said, I could not find any mention of a minimum frame size
limitation for partial checksums in the AXI Ethernet documentation.
RX_CSRAW is calculated over the whole packet, so it's possible that this
check is trying to avoid passing it to the net subsystem when the frame
has been padded. However, skb->len is the length of the Ethernet packet,
so we can't tell how long the original packet was at this point. That
can only be determined from the L3 header, which isn't parsed yet. I
assume this is handled by the net subsystem.
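
A small sketch of why padding matters for a whole-packet checksum, and how
its contribution can be backed out once the L3 length is known. The hex
input is an illustrative padded IPv4/UDP payload, and the function names are
made up. My understanding is that the stack does something equivalent when
it trims the skb to the IP total length (pskb_trim_rcsum()), but treat that
as an assumption rather than a description of the kernel code.

```python
def csum16(data: bytes) -> int:
    """16-bit ones'-complement sum (RFC 1071 style), not inverted."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum16_sub(a: int, b: int) -> int:
    """Ones'-complement subtraction: a + ~b, folded to 16 bits."""
    total = a + (~b & 0xFFFF)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


# IPv4 header + UDP header + 6 data bytes, followed by 12 pad bytes
# (the ASCII string "padding" and then zeros). Illustrative input only.
payload = bytes.fromhex(
    "4500002200010000401166c80a000001"
    "0a0000026ae5115c000e000564617461"
    "0d0a70616464696e670000000000"
)
ip_total_len = int.from_bytes(payload[2:4], "big")   # length claimed by the IP header
datagram, padding = payload[:ip_total_len], payload[ip_total_len:]

raw = csum16(payload)                        # what a whole-packet sum covers
# Back out the pad. This simple form relies on the pad starting at an
# even offset; an odd offset would need a byte swap first.
trimmed = csum16_sub(raw, csum16(padding))

print(hex(raw), hex(csum16(datagram)), hex(trimmed))
```

With non-zero pad bytes the raw sum differs from the sum over the datagram
alone, and subtracting the pad's sum recovers it.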

--Sean
Eric Dumazet Sept. 5, 2024, 2:59 p.m. UTC | #4
On Thu, Sep 5, 2024 at 4:24 PM Sean Anderson <sean.anderson@linux.dev> wrote:
>
> On 9/4/24 12:30, Eric Dumazet wrote:
> > On Tue, Sep 3, 2024 at 8:43 PM Sean Anderson <sean.anderson@linux.dev> wrote:
> >>
> >> The partial rx checksum feature computes a checksum over the entire
> >> packet, regardless of the L3 protocol. Remove the check for IPv4.
> >> Additionally, packets under 64 bytes should have been dropped by the
> >> MAC, so we can remove the length check as well.
> >
> > Some packets have a smaller len (than 64).
> >
> > For instance, TCP pure ACK and no options over IPv4 would be 54 bytes long.
> >
> > Presumably they are not dropped by the MAC ?
>
> Ethernet frames have a minimum size on the wire of 64 bytes. From 802.3
> section 4.2.4.2.2:
>
> | The shortest valid transmission in full duplex mode must be at least
> | minFrameSize in length. While collisions do not occur in full duplex
> | mode MACs, a full duplex MAC nevertheless discards received frames
> | containing less than minFrameSize bits. The discarding of such a frame
> | by a MAC is not reported as an error.
>
> where minFrameSize is 512 bits (64 bytes).
>
> On the transmit side, undersize frames are padded. From 802.3 section
> 4.2.3.3:
>
> | The CSMA/CD Media Access mechanism requires that a minimum frame
> | length of minFrameSize bits be transmitted. If frameSize is less than
> | minFrameSize, then the CSMA/CD MAC sublayer shall append extra bits in
> | units of octets (Pad), after the end of the MAC Client Data field but
> | prior to calculating and appending the FCS (if not provided by the MAC
> | client).
>
> That said, I could not find any mention of a minimum frame size
> limitation for partial checksums in the AXI Ethernet documentation.
> RX_CSRAW is calculated over the whole packet, so it's possible that this
> check is trying to avoid passing it to the net subsystem when the frame
> has been padded. However, skb->len is the length of the Ethernet packet,
> so we can't tell how long the original packet was at this point. That
> can only be determined from the L3 header, which isn't parsed yet. I
> assume this is handled by the net subsystem.
>

The fact there was a check in the driver hints about something.

It is possible the csum is incorrect if a 'padding' is added at the
receiver, if the padding has non zero bytes, and is not included in
the csum.

Look at this relevant patch :

Author: Saeed Mahameed <saeedm@mellanox.com>
Date:   Mon Feb 11 18:04:17 2019 +0200

    net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
Sean Anderson Sept. 5, 2024, 4:32 p.m. UTC | #5
On 9/5/24 10:59, Eric Dumazet wrote:
> On Thu, Sep 5, 2024 at 4:24 PM Sean Anderson <sean.anderson@linux.dev> wrote:
>>
>> On 9/4/24 12:30, Eric Dumazet wrote:
>> > On Tue, Sep 3, 2024 at 8:43 PM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >>
>> >> The partial rx checksum feature computes a checksum over the entire
>> >> packet, regardless of the L3 protocol. Remove the check for IPv4.
>> >> Additionally, packets under 64 bytes should have been dropped by the
>> >> MAC, so we can remove the length check as well.
>> >
>> > Some packets have a smaller len (than 64).
>> >
>> > For instance, TCP pure ACK and no options over IPv4 would be 54 bytes long.
>> >
>> > Presumably they are not dropped by the MAC ?
>>
>> Ethernet frames have a minimum size on the wire of 64 bytes. From 802.3
>> section 4.2.4.2.2:
>>
>> | The shortest valid transmission in full duplex mode must be at least
>> | minFrameSize in length. While collisions do not occur in full duplex
>> | mode MACs, a full duplex MAC nevertheless discards received frames
>> | containing less than minFrameSize bits. The discarding of such a frame
>> | by a MAC is not reported as an error.
>>
>> where minFrameSize is 512 bits (64 bytes).
>>
>> On the transmit side, undersize frames are padded. From 802.3 section
>> 4.2.3.3:
>>
>> | The CSMA/CD Media Access mechanism requires that a minimum frame
>> | length of minFrameSize bits be transmitted. If frameSize is less than
>> | minFrameSize, then the CSMA/CD MAC sublayer shall append extra bits in
>> | units of octets (Pad), after the end of the MAC Client Data field but
>> | prior to calculating and appending the FCS (if not provided by the MAC
>> | client).
>>
>> That said, I could not find any mention of a minimum frame size
>> limitation for partial checksums in the AXI Ethernet documentation.
>> RX_CSRAW is calculated over the whole packet, so it's possible that this
>> check is trying to avoid passing it to the net subsystem when the frame
>> has been padded. However, skb->len is the length of the Ethernet packet,
>> so we can't tell how long the original packet was at this point. That
>> can only be determined from the L3 header, which isn't parsed yet. I
>> assume this is handled by the net subsystem.
>>
> 
> The fact there was a check in the driver hints about something.
> 
> It is possible the csum is incorrect if a 'padding' is added at the
> receiver, if the padding has non zero bytes, and is not included in
> the csum.
> 
> Look at this relevant patch :
> 
> Author: Saeed Mahameed <saeedm@mellanox.com>
> Date:   Mon Feb 11 18:04:17 2019 +0200
> 
>     net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames

Well, I tested UDP and it appears to be working fine. First I ran

# nc -lu

on the DUT. On the other host I used scapy to send a packet with some
non-zero padding:

  >>> port = RandShort()
  >>> send(IP(dst="10.0.0.2")/UDP(sport=port, dport=4444)/Raw(b'data\r\n')/Padding(load=b'padding'))

I verified that the packet was received correctly, both in netcat and
with tcpdump:

    # tcpdump -i net4 -xXn 
    tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
    listening on net4, link-type EN10MB (Ethernet), snapshot length 262144 bytes
    16:07:45.083795 IP 10.0.0.1.27365 > 10.0.0.2.4444: UDP, length 6
            0x0000:  4500 0022 0001 0000 4011 66c8 0a00 0001  E.."....@.f.....
            0x0010:  0a00 0002 6ae5 115c 000e 0005 6461 7461  ....j..\....data
            0x0020:  0d0a 7061 6464 696e 6700 0000 0000       ..padding.....

and also checked for checksum errors:

  # netstat -s | grep InCsumErrors
      InCsumErrors: 0

To verify that checksums were being checked properly, I also sent a
packet with an invalid checksum:

  >>> send(IP(dst="10.0.0.2")/UDP(sport=port, dport=4444, chksum=5)/Raw(b'data\r\n')/Padding(load=b'padding'))

and confirmed that there was no output on netcat, and that I had gotten
a UDP checksum error:

  # netstat -s | grep InCsumErrors
      InCsumErrors: 1
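
For reference, a minimal sketch of the UDP checksum rule this test
exercises: the ones'-complement sum over the IPv4 pseudo-header plus the UDP
segment must come out as 0xFFFF. The addresses, ports and data follow the
test above; the helper names are made up, and this is plain Python rather
than anything scapy or the kernel provides.

```python
import socket
import struct


def csum16(data: bytes) -> int:
    """16-bit ones'-complement sum (RFC 1071 style), not inverted."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def build_udp(sport: int, dport: int, data: bytes, chksum: int) -> bytes:
    """Hypothetical helper: UDP header (with the given checksum) plus data."""
    return struct.pack("!HHHH", sport, dport, 8 + len(data), chksum) + data


def udp_verify_sum(src: str, dst: str, segment: bytes) -> int:
    """Sum over pseudo-header + segment; 0xFFFF means the checksum is valid."""
    pseudo = (socket.inet_aton(src) + socket.inet_aton(dst)
              + struct.pack("!BBH", 0, socket.IPPROTO_UDP, len(segment)))
    return csum16(pseudo + segment)


SRC, DST = "10.0.0.1", "10.0.0.2"
SPORT, DPORT = 27365, 4444
data = b"data\r\n"

# Correct value: complement of the sum computed with the field zeroed.
# (A result of 0 would have to be sent as 0xFFFF; not handled in this sketch.)
good = 0xFFFF ^ udp_verify_sum(SRC, DST, build_udp(SPORT, DPORT, data, 0))

for chksum in (good, 5):
    total = udp_verify_sum(SRC, DST, build_udp(SPORT, DPORT, data, chksum))
    print(f"chksum={chksum:#06x} valid={total == 0xFFFF}")
```

The pad bytes are outside the UDP length, so they play no part in this
calculation; they only matter for a sum taken over the whole frame.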

I can try to test TCP as well, but it is a bit trickier due to the 3-way
handshake. From the documentation, partial checksums should be agnostic
to the L3 protocol, so I don't think there should be any difference.

--Sean
Sean Anderson Sept. 6, 2024, 9:37 p.m. UTC | #6
Hi Eric,

On 9/5/24 12:32, Sean Anderson wrote:
> On 9/5/24 10:59, Eric Dumazet wrote:
>> On Thu, Sep 5, 2024 at 4:24 PM Sean Anderson <sean.anderson@linux.dev> wrote:
>>>
>>> On 9/4/24 12:30, Eric Dumazet wrote:
>>> > On Tue, Sep 3, 2024 at 8:43 PM Sean Anderson <sean.anderson@linux.dev> wrote:
>>> >>
>>> >> The partial rx checksum feature computes a checksum over the entire
>>> >> packet, regardless of the L3 protocol. Remove the check for IPv4.
>>> >> Additionally, packets under 64 bytes should have been dropped by the
>>> >> MAC, so we can remove the length check as well.
>>> >
>>> > Some packets have a smaller len (than 64).
>>> >
>>> > For instance, TCP pure ACK and no options over IPv4 would be 54 bytes long.
>>> >
>>> > Presumably they are not dropped by the MAC ?
>>>
>>> Ethernet frames have a minimum size on the wire of 64 bytes. From 802.3
>>> section 4.2.4.2.2:
>>>
>>> | The shortest valid transmission in full duplex mode must be at least
>>> | minFrameSize in length. While collisions do not occur in full duplex
>>> | mode MACs, a full duplex MAC nevertheless discards received frames
>>> | containing less than minFrameSize bits. The discarding of such a frame
>>> | by a MAC is not reported as an error.
>>>
>>> where minFrameSize is 512 bits (64 bytes).
>>>
>>> On the transmit side, undersize frames are padded. From 802.3 section
>>> 4.2.3.3:
>>>
>>> | The CSMA/CD Media Access mechanism requires that a minimum frame
>>> | length of minFrameSize bits be transmitted. If frameSize is less than
>>> | minFrameSize, then the CSMA/CD MAC sublayer shall append extra bits in
>>> | units of octets (Pad), after the end of the MAC Client Data field but
>>> | prior to calculating and appending the FCS (if not provided by the MAC
>>> | client).
>>>
>>> That said, I could not find any mention of a minimum frame size
>>> limitation for partial checksums in the AXI Ethernet documentation.
>>> RX_CSRAW is calculated over the whole packet, so it's possible that this
>>> check is trying to avoid passing it to the net subsystem when the frame
>>> has been padded. However, skb->len is the length of the Ethernet packet,
>>> so we can't tell how long the original packet was at this point. That
>>> can only be determined from the L3 header, which isn't parsed yet. I
>>> assume this is handled by the net subsystem.
>>>
>> 
>> The fact there was a check in the driver hints about something.
>> 
>> It is possible the csum is incorrect if a 'padding' is added at the
>> receiver, if the padding has non zero bytes, and is not included in
>> the csum.
>> 
>> Look at this relevant patch :
>> 
>> Author: Saeed Mahameed <saeedm@mellanox.com>
>> Date:   Mon Feb 11 18:04:17 2019 +0200
>> 
>>     net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
> 
> Well, I tested UDP and it appears to be working fine. First I ran
> 
> # nc -lu
> 
> on the DUT. On the other host I used scapy to send a packet with some
> non-zero padding:
> 
>   >>> port = RandShort()
>   >>> send(IP(dst="10.0.0.2")/UDP(sport=port, dport=4444)/Raw(b'data\r\n')/Padding(load=b'padding'))
> 
> I verified that the packet was received correctly, both in netcat and
> with tcpdump:
> 
>     # tcpdump -i net4 -xXn 
>     tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
>     listening on net4, link-type EN10MB (Ethernet), snapshot length 262144 bytes
>     16:07:45.083795 IP 10.0.0.1.27365 > 10.0.0.2.4444: UDP, length 6
>             0x0000:  4500 0022 0001 0000 4011 66c8 0a00 0001  E.."....@.f.....
>             0x0010:  0a00 0002 6ae5 115c 000e 0005 6461 7461  ....j..\....data
>             0x0020:  0d0a 7061 6464 696e 6700 0000 0000       ..padding.....
> 
> and also checked for checksum errors:
> 
>   # netstat -s | grep InCsumErrors
>       InCsumErrors: 0
> 
> to verify that checksums were being checked properly, I also sent a
> packet with an invalid checksum:
> 
>   >>> send(IP(dst="10.0.0.2")/UDP(sport=port, dport=4444, chksum=5)/Raw(b'data\r\n')/Padding(load=b'padding'))
> 
> and confirmed that there was no output on netcat, and that I had gotten
> a UDP checksum error:
> 
>   # netstat -s | grep InCsumErrors
>       InCsumErrors: 1
> 
> I can try to test TCP as well, but it is a bit trickier due to the 3-way
> handshake. From the documentation, partial checksums should be agnostic
> to the L3 protocol, so I don't think there should be any difference.
> 
> --Sean

I saw that there was a checksum selftest today, so I went back and ran
that as well. I managed to get it to pass:

# NETIF=net LOCAL_V4=10.0.0.1 LOCAL_V6=fc00::1 REMOTE_V4=10.0.0.2 REMOTE_V6=fc00::2 REMOTE_TYPE=netns REMOTE_ARGS=ns2 ip netns exec ns1 kselftest_install/drivers/net/hw/csum.py
KTAP version 1
1..12
ok 1 csum.ipv4_rx_tcp
ok 2 csum.ipv4_rx_tcp_invalid
ok 3 csum.ipv4_rx_udp
ok 4 csum.ipv4_rx_udp_invalid
ok 5 csum.ipv4_tx_udp_csum_offload
ok 6 csum.ipv4_tx_udp_zero_checksum
ok 7 csum.ipv6_rx_tcp
ok 8 csum.ipv6_rx_tcp_invalid
ok 9 csum.ipv6_rx_udp
ok 10 csum.ipv6_rx_udp_invalid
ok 11 csum.ipv6_tx_udp_csum_offload
ok 12 csum.ipv6_tx_udp_zero_checksum
# Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0

However, I ended up having to modify the test [1] to handle exactly this
situation (though in the test's reference checksum). I also had to add
another patch to set NETIF_F_RXCSUM for this driver. I think this shows
that there should be no hardware issue with removing the length check.
I'll send a v2 on Monday with the RXCSUM patch unless you have any
objections.

--Sean

[1] https://lore.kernel.org/netdev/20240906210743.627413-1-sean.anderson@linux.dev

Patch

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 74fade5a95c2..99d08a775520 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1188,9 +1188,7 @@  static int axienet_rx_poll(struct napi_struct *napi, int budget)
 				    csumstatus == XAE_IP_UDP_CSUM_VALIDATED) {
 					skb->ip_summed = CHECKSUM_UNNECESSARY;
 				}
-			} else if ((lp->features & XAE_FEATURE_PARTIAL_RX_CSUM) != 0 &&
-				   skb->protocol == htons(ETH_P_IP) &&
-				   skb->len > 64) {
+			} else if (lp->features & XAE_FEATURE_PARTIAL_RX_CSUM) {
 				skb->csum = be32_to_cpu(cur_p->app3 & 0xFFFF);
 				skb->ip_summed = CHECKSUM_COMPLETE;
 			}
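
As a rough illustration of what the hunk above changes, the following models
the old and new conditions as plain predicates and lists which packets are
treated differently. It ignores the preceding full-checksum branch, the
feature bit value is a placeholder and not the driver's real definition, and
`length` stands in for skb->len.

```python
ETH_P_IP = 0x0800
ETH_P_IPV6 = 0x86DD
PARTIAL_RX_CSUM = 1 << 0   # placeholder for XAE_FEATURE_PARTIAL_RX_CSUM


def old_uses_partial_csum(features: int, protocol: int, length: int) -> bool:
    """Condition before the patch: IPv4 only, and only above 64 bytes."""
    return ((features & PARTIAL_RX_CSUM) != 0
            and protocol == ETH_P_IP
            and length > 64)


def new_uses_partial_csum(features: int, protocol: int, length: int) -> bool:
    """Condition after the patch: the feature flag alone decides."""
    return (features & PARTIAL_RX_CSUM) != 0


cases = [
    ("IPv4 short", ETH_P_IP, 46),
    ("IPv4 large", ETH_P_IP, 1500),
    ("IPv6 short", ETH_P_IPV6, 46),
    ("IPv6 large", ETH_P_IPV6, 1500),
]
for name, proto, length in cases:
    old = old_uses_partial_csum(PARTIAL_RX_CSUM, proto, length)
    new = new_uses_partial_csum(PARTIAL_RX_CSUM, proto, length)
    print(f"{name}: old={old} new={new}")
```

Only large IPv4 packets took the CHECKSUM_COMPLETE path before; with the
patch, short packets and non-IPv4 packets take it as well.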