[net,v2,2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes

Message ID 20231216-new-gemini-ethernet-regression-v2-2-64c269413dfa@linaro.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series Fix a regression in the Gemini ethernet controller.

Checks

Context                          Check    Description
netdev/series_format             success  Posting correctly formatted
netdev/tree_selection            success  Clearly marked for net
netdev/ynl                       success  SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present             success  Fixes tag present in non-next series
netdev/header_inline             success  No static functions without inline keyword in header files
netdev/build_32bit               success  Errors and warnings before: 1115 this patch: 1115
netdev/cc_maintainers            fail     1 blamed authors not CCed: olteanv@gmail.com; 2 maintainers not CCed: olteanv@gmail.com linux-arm-kernel@lists.infradead.org
netdev/build_clang               success  Errors and warnings before: 1142 this patch: 1142
netdev/verify_signedoff          success  Signed-off-by tag matches author and committer
netdev/deprecated_api            success  None detected
netdev/check_selftest            success  No net selftest shell script
netdev/verify_fixes              success  Fixes tag looks correct
netdev/build_allmodconfig_warn   success  Errors and warnings before: 1142 this patch: 1142
netdev/checkpatch                warning  WARNING: Possible repeated word: 'the'; WARNING: line length of 82 exceeds 80 columns
netdev/build_clang_rust          success  No Rust files in patch. Skipping build
netdev/kdoc                      success  Errors and warnings before: 0 this patch: 0
netdev/source_inline             success  Was 0 now: 0

Commit Message

Linus Walleij Dec. 16, 2023, 7:36 p.m. UTC
We had workarounds where the ethernet checksumming engine would be bypassed
for larger frames. This fixed devices using DSA, but regressed devices
where the ethernet was connected directly to a PHY.

The devices with a PHY connected directly can't handle large frames
either way, with or without bypass. Looking at the size of the frame
is probably just wrong.

Rework the workaround such that we just bypass the checksumming engine if
the ethertype inside the actual frame is something other than 0x0800
(IPv4) or 0x86dd (IPv6). These are the only frames the checksumming engine
can actually handle. VLAN framing (0x8100) also works fine.

We can't inspect skb->protocol because DSA frames will sometimes have a
custom ethertype even though skb->protocol is e.g. 0x0800.

After this, both devices with directly attached ethernet, such as the
D-Link DNS-313, and devices with a DSA switch using a custom ethertype,
such as the D-Link DIR-685, work fine.

Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/net/ethernet/cortina/gemini.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)
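
The decision rule described in the commit message can be sketched in isolation as follows. This is a user-space illustration only, not driver code: the helper name csum_engine_bypass() is hypothetical, and the ethertype passed in is assumed to be the one found in the frame after any single 802.1Q tag has been skipped. The value 0x8899 in the usage note stands in for an arbitrary custom DSA ethertype.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_IP   0x0800 /* IPv4 */
#define ETH_P_IPV6 0x86DD /* IPv6 */

/* Hypothetical helper, not part of the driver: returns true when a frame
 * carrying this on-wire ethertype should skip the checksumming engine,
 * i.e. when it is neither IPv4 nor IPv6.
 */
static bool csum_engine_bypass(uint16_t ethertype)
{
	return ethertype != ETH_P_IP && ethertype != ETH_P_IPV6;
}
```

With this rule, csum_engine_bypass(0x0800) and csum_engine_bypass(0x86DD) are false, while a custom tag ethertype such as 0x8899 or a non-IP protocol such as ARP (0x0806) yields true.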

Comments

Eric Dumazet Dec. 18, 2023, 2:50 p.m. UTC | #1
On Sat, Dec 16, 2023 at 8:36 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> We had workarounds were the ethernet checksumming engine would be bypassed
> for larger frames, this fixed devices using DSA, but regressed devices
> where the ethernet was connected directly to a PHY.
>
> The devices with a PHY connected directly can't handle large frames
> either way, with or without bypass. Looking at the size of the frame
> is probably just wrong.
>
> Rework the workaround such that we just bypass the checksumming engine if
> the ethertype inside the actual frame is something else than 0x0800
> (IPv4) or 0x86dd (IPv6). These are the only frames the checksumming engine
> can actually handle. VLAN framing (0x8100) also works fine.
>
> We can't inspect skb->protocol because DSA frames will sometimes have a
> custom ethertype despite skb->protocol is e.g. 0x0800.
>
> After this both devices with direct ethernet attached such as D-Link
> DNS-313 and devices with a DSA switch with a custom ethertype such as
> D-Link DIR-685 work fine.
>
> Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
>  drivers/net/ethernet/cortina/gemini.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
> index 6a7ea051391a..1400f19bf05b 100644
> --- a/drivers/net/ethernet/cortina/gemini.c
> +++ b/drivers/net/ethernet/cortina/gemini.c
> @@ -1143,7 +1143,9 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
>         skb_frag_t *skb_frag;
>         dma_addr_t mapping;
>         unsigned short mtu;
> +       u16 ethertype;
>         void *buffer;
> +       __be16 *p;
>
>         mtu  = ETH_HLEN;
>         mtu += netdev->mtu;
> @@ -1158,7 +1160,24 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
>                 word3 |= mtu;
>         }
>
> -       if (skb->ip_summed == CHECKSUM_PARTIAL) {
> +       /* Dig out the the ethertype actually in the buffer and not what the
> +        * protocol claims to be. This is the raw data that the checksumming
> +        * offload engine will have to deal with.
> +        */
> +       p = (__be16 *)(skb->data + 2 * ETH_ALEN);
> +       ethertype = ntohs(*p);
> +       if (ethertype == ETH_P_8021Q) {
> +               p += 2; /* +2 sizeof(__be16) */
> +               ethertype = ntohs(*p);
> +       }

Presumably all you need is to call vlan_get_protocol() ?

> +
> +       if (ethertype != ETH_P_IP && ethertype != ETH_P_IPV6) {
> +               /* Hardware offloaded checksumming isn't working on non-IP frames.
> +                * This happens for example on some DSA switches using a custom
> +                * ethertype. Just bypass the engine for those.
> +                */
> +               word1 |= TSS_BYPASS_BIT;
> +       } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
>                 int tcp = 0;
>
>                 /* We do not switch off the checksumming on non TCP/UDP
>
> --
> 2.34.1
>

Linus Walleij Dec. 18, 2023, 11:41 p.m. UTC | #2
On Mon, Dec 18, 2023 at 3:50 PM Eric Dumazet <edumazet@google.com> wrote:
> On Sat, Dec 16, 2023 at 8:36 PM Linus Walleij <linus.walleij@linaro.org> wrote:

> > +       /* Dig out the the ethertype actually in the buffer and not what the
> > +        * protocol claims to be. This is the raw data that the checksumming
> > +        * offload engine will have to deal with.
> > +        */
> > +       p = (__be16 *)(skb->data + 2 * ETH_ALEN);
> > +       ethertype = ntohs(*p);
> > +       if (ethertype == ETH_P_8021Q) {
> > +               p += 2; /* +2 sizeof(__be16) */
> > +               ethertype = ntohs(*p);
> > +       }
>
> Presumably all you need is to call vlan_get_protocol() ?

Sadly no. As the comment says: we want the ethertype that is actually in the
skb, not what is in skb->protocol, and the code in vlan_get_protocol() just
trusts skb->protocol to be the ethertype in the frame, especially if vlan
is not used.

This is often what we want: DSA switches will "wash" custom ethertypes
before they go out, but in this case the custom ethertype upsets the
ethernet checksum engine used as conduit interface toward the DSA
switch.

Yours,
Linus Walleij

Eric Dumazet Dec. 19, 2023, 9:14 a.m. UTC | #3
On Tue, Dec 19, 2023 at 12:42 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Mon, Dec 18, 2023 at 3:50 PM Eric Dumazet <edumazet@google.com> wrote:
> > On Sat, Dec 16, 2023 at 8:36 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> > > +       /* Dig out the the ethertype actually in the buffer and not what the
> > > +        * protocol claims to be. This is the raw data that the checksumming
> > > +        * offload engine will have to deal with.
> > > +        */
> > > +       p = (__be16 *)(skb->data + 2 * ETH_ALEN);
> > > +       ethertype = ntohs(*p);
> > > +       if (ethertype == ETH_P_8021Q) {
> > > +               p += 2; /* +2 sizeof(__be16) */
> > > +               ethertype = ntohs(*p);
> > > +       }
> >
> > Presumably all you need is to call vlan_get_protocol() ?
>
> Sadly no. As the comment says: we want the ethertype that is actually in the
> skb, not what is in skb->protocol, and the code in vlan_get_protocol() just
> trusts skb->protocol to be the ethertype in the frame, especially if vlan
> is not used.
>
> This is often what we want: DSA switches will "wash" custom ethertypes
> before they go out, but in this case the custom ethertype upsets the
> ethernet checksum engine used as conduit interface toward the DSA
> switch.

The problem is that your code is missing a skb_header_pointer() or
pskb_may_pull() call...
The second "ethertype = ntohs(*p);" might access uninitialized data.

If this is a common operation, perhaps use a common helper from all drivers,
this would help code review a bit...

Linus Walleij Dec. 19, 2023, 2:22 p.m. UTC | #4
On Tue, Dec 19, 2023 at 10:15 AM Eric Dumazet <edumazet@google.com> wrote:
> On Tue, Dec 19, 2023 at 12:42 AM Linus Walleij <linus.walleij@linaro.org> wrote:

> > This is often what we want: DSA switches will "wash" custom ethertypes
> > before they go out, but in this case the custom ethertype upsets the
> > ethernet checksum engine used as conduit interface toward the DSA
> > switch.
>
>  Problem is that your code misses skb_header_pointer() or
> pskb_may_pull() call...
> Second "ethertype = ntohs(*p);" might access uninitialized data.

Yeah, needs to be done properly and look at skb->len etc.

> If this is a common operation, perhaps use a common helper from all drivers,
> this would help code review a bit...

You are right, Maxime opened a discussion on it in parallel,
I'll cook something up!

Yours,
Linus Walleij
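
The review point about reading past the valid header data can be illustrated with a bounds-checked user-space sketch. This is an assumption-laden analogue, not the eventual kernel fix: frame_ethertype() and read_be16() are hypothetical names, the frame is modelled as a flat byte buffer with an explicit length, and in the driver the equivalent guard would come from skb_header_pointer() or pskb_may_pull() as suggested in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN    6      /* bytes in a MAC address */
#define VLAN_HLEN   4      /* 802.1Q tag: TPID (2 bytes) + TCI (2 bytes) */
#define ETH_P_8021Q 0x8100 /* 802.1Q VLAN tag */

/* Read a big-endian 16-bit value without relying on alignment. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)(((unsigned int)p[0] << 8) | p[1]);
}

/* Hypothetical helper: store the ethertype actually present in the frame
 * in *out, skipping one 802.1Q tag if there is one. Returns false when the
 * buffer is too short to contain the field being read.
 */
static bool frame_ethertype(const uint8_t *data, size_t len, uint16_t *out)
{
	size_t off = 2 * ETH_ALEN; /* ethertype follows both MAC addresses */

	if (len < off + 2)
		return false;
	*out = read_be16(data + off);

	if (*out == ETH_P_8021Q) {
		/* Same distance as "p += 2" on a __be16 pointer: 4 bytes. */
		off += VLAN_HLEN;
		if (len < off + 2)
			return false;
		*out = read_be16(data + off);
	}
	return true;
}
```

The second length check is the one the posted patch lacks: a frame that ends right after the VLAN TPID would otherwise be read beyond its valid data.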

Patch

diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index 6a7ea051391a..1400f19bf05b 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -1143,7 +1143,9 @@  static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 	skb_frag_t *skb_frag;
 	dma_addr_t mapping;
 	unsigned short mtu;
+	u16 ethertype;
 	void *buffer;
+	__be16 *p;
 
 	mtu  = ETH_HLEN;
 	mtu += netdev->mtu;
@@ -1158,7 +1160,24 @@  static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 		word3 |= mtu;
 	}
 
-	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+	/* Dig out the the ethertype actually in the buffer and not what the
+	 * protocol claims to be. This is the raw data that the checksumming
+	 * offload engine will have to deal with.
+	 */
+	p = (__be16 *)(skb->data + 2 * ETH_ALEN);
+	ethertype = ntohs(*p);
+	if (ethertype == ETH_P_8021Q) {
+		p += 2; /* +2 sizeof(__be16) */
+		ethertype = ntohs(*p);
+	}
+
+	if (ethertype != ETH_P_IP && ethertype != ETH_P_IPV6) {
+		/* Hardware offloaded checksumming isn't working on non-IP frames.
+		 * This happens for example on some DSA switches using a custom
+		 * ethertype. Just bypass the engine for those.
+		 */
+		word1 |= TSS_BYPASS_BIT;
+	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		int tcp = 0;
 
 		/* We do not switch off the checksumming on non TCP/UDP