diff mbox series

dsa: tag_rtl4_a: Bump min packet size

Message ID 20231027-fix-rtl8366rb-v1-1-d565d905535a@linaro.org (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series dsa: tag_rtl4_a: Bump min packet size | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1362 this patch: 1362
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 1387 this patch: 1387
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1387 this patch: 1387
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 13 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Linus Walleij Oct. 27, 2023, 8:21 p.m. UTC
It was reported that the "LuCI" web UI was not working properly
with a device using the RTL8366RB switch. Disabling the egress
port tagging code made the switch work again, but this is not
a good solution as we want to be able to direct traffic to a
certain port.

It turns out that sometimes, but not always, small packets are
dropped by the switch for no reason.

If we pad the ethernet frames to a minimum of ETH_FRAME_LEN + FCS
(1518 bytes) everything starts working fine.

As we completely lack datasheet or any insight into how this
switch works, this trial-and-error solution is the best we
can do.

I have tested smaller sizes, ETH_FRAME_LEN doesn't work for
example.

Fixes: 0e90dfa7a8d8 ("net: dsa: tag_rtl4_a: Fix egress tags")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 net/dsa/tag_rtl4_a.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)


---
base-commit: 58720809f52779dc0f08e53e54b014209d13eebb
change-id: 20231027-fix-rtl8366rb-e752bd5901ca

Best regards,

Comments

Florian Fainelli Oct. 27, 2023, 9:23 p.m. UTC | #1
You would want your subject to be:

net: dsa: tag_rtl4_a: Bump min packet size

On 10/27/23 13:21, Linus Walleij wrote:
> It was reported that the "LuCI" web UI was not working properly
> with a device using the RTL8366RB switch. Disabling the egress
> port tagging code made the switch work again, but this is not
> a good solution as we want to be able to direct traffic to a
> certain port.
> 
> It turns out that sometimes, but not always, small packets are
> dropped by the switch for no reason.

And we are positive that the Ethernet MAC is also properly padding 
frames before having them ingress the switch?

> 
> If we pad the ethernet frames to a minimum of ETH_FRAME_LEN + FCS
> (1518 bytes) everything starts working fine.

That is quite unprecedented, either the switch is very bogus or there is 
something else we do not fully understand...
Andrew Lunn Oct. 27, 2023, 11:03 p.m. UTC | #2
On Fri, Oct 27, 2023 at 02:23:13PM -0700, Florian Fainelli wrote:
> You would want your subject to be:
> 
> net: dsa: tag_rtl4_a: Bump min packet size
> 
> On 10/27/23 13:21, Linus Walleij wrote:
> > It was reported that the "LuCI" web UI was not working properly
> > with a device using the RTL8366RB switch. Disabling the egress
> > port tagging code made the switch work again, but this is not
> > a good solution as we want to be able to direct traffic to a
> > certain port.
> > 
> > It turns out that sometimes, but not always, small packets are
> > dropped by the switch for no reason.
> 
> And we are positive that the Ethernet MAC is also properly padding frames
> before having them ingress the switch?

It might be interesting to run a script which systematically does a
ping, or similar, for all frame sizes.

> > If we pad the ethernet frames to a minimum of ETH_FRAME_LEN + FCS
> > (1518 bytes) everything starts working fine.
> 
> That is quite unprecedented, either the switch is very bogus or there is
> something else we do not fully understand...

It would also be interesting to know if the frames on the wire have
the padding removed when needed. Its not going to be good for
performance if a TCP ACK is 1500bytes in size, rather than the usual
~64.

Have problems also been noticed with traffic going from user port to
user port?

	Andrew
Vladimir Oltean Oct. 28, 2023, 10:04 p.m. UTC | #3
On Fri, Oct 27, 2023 at 10:21:39PM +0200, Linus Walleij wrote:
> It was reported that the "LuCI" web UI was not working properly
> with a device using the RTL8366RB switch. Disabling the egress
> port tagging code made the switch work again, but this is not
> a good solution as we want to be able to direct traffic to a
> certain port.
> 
> It turns out that sometimes, but not always, small packets are
> dropped by the switch for no reason.

"For no reason" is a technical statement which means "an unspecific/inconclusive
drop reason in the ethtool -S output on the conduit interface (which also
shows the hardware counters of the CPU port", or is it just a figure of
speech? If just a figure of speech, could you please determine which
counter gets incremented when the switch drops packets?

What user port is being targeted when the switch drops packets? Any user
port, or just specific ones?

What protocol headers do those packets that are dropped have? Is it size
that they have in common, I wonder (given that you say that small
packets are not always dropped), or is it something else?

> If we pad the ethernet frames to a minimum of ETH_FRAME_LEN + FCS
> (1518 bytes) everything starts working fine.
> 
> As we completely lack datasheet or any insight into how this
> switch works, this trial-and-error solution is the best we
> can do.
> 
> I have tested smaller sizes, ETH_FRAME_LEN doesn't work for
> example.
> 
> Fixes: 0e90dfa7a8d8 ("net: dsa: tag_rtl4_a: Fix egress tags")

Have you actually checked out this sha1sum and confirmed that the packet
drop can be reproduced there? Ideally you could also go back to a bit
earlier, to commit 9eb8bc593a5e ("net: dsa: tag_rtl4_a: fix egress tags")
(this is a different commit from Qingfang with the same description) and
test on user port 0 only?

> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
>  net/dsa/tag_rtl4_a.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/net/dsa/tag_rtl4_a.c b/net/dsa/tag_rtl4_a.c
> index c327314b95e3..8b1e81a6377b 100644
> --- a/net/dsa/tag_rtl4_a.c
> +++ b/net/dsa/tag_rtl4_a.c
> @@ -41,8 +41,11 @@ static struct sk_buff *rtl4a_tag_xmit(struct sk_buff *skb,
>  	u8 *tag;
>  	u16 out;
>  
> -	/* Pad out to at least 60 bytes */
> -	if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false)))
> +	/* We need to pad out to at least ETH_FRAME_LEN + FCS bytes. This was
> +	 * found by trial-and-error. Sometimes smaller frames work, but sadly
> +	 * not always.
> +	 */
> +	if (unlikely(__skb_put_padto(skb, ETH_FRAME_LEN + ETH_FCS_LEN, false)))
>  		return NULL;
>  
>  	netdev_dbg(dev, "add realtek tag to package to port %d\n",
> 
> ---
> base-commit: 58720809f52779dc0f08e53e54b014209d13eebb
> change-id: 20231027-fix-rtl8366rb-e752bd5901ca
> 
> Best regards,
> -- 
> Linus Walleij <linus.walleij@linaro.org>
> 

Until more is known:

pw-bot: cr
Linus Walleij Oct. 29, 2023, 3:38 p.m. UTC | #4
On Fri, Oct 27, 2023 at 11:23 PM Florian Fainelli <f.fainelli@gmail.com> wrote:

> > It turns out that sometimes, but not always, small packets are
> > dropped by the switch for no reason.
>
> And we are positive that the Ethernet MAC is also properly padding
> frames before having them ingress the switch?

I don't fully follow, this code is the one adding the padding isn't it?
Then the result is transmitted to the switch from the ethernet
MAC (drivers/net/ethernet/cortina/gemini.c).

What am I getting wrong here...

> > If we pad the ethernet frames to a minimum of ETH_FRAME_LEN + FCS
> > (1518 bytes) everything starts working fine.
>
> That is quite unprecedented, either the switch is very bogus or there is
> something else we do not fully understand...

The switch is pretty bogus, all documentation we have of it is a vendor
code drop, no data sheet. The format for ingress and egress tags
was discovered using trial-and-error.

Yours,
Linus Walleij
Andrew Lunn Oct. 29, 2023, 5:35 p.m. UTC | #5
> The switch is pretty bogus, all documentation we have of it is a vendor
> code drop, no data sheet. The format for ingress and egress tags
> was discovered using trial-and-error.

The code drop does not include tagging? Does the code drop also pad to
full size Ethernet frames?

I still think it would be good to do some systematic testing to figure
out if there is any pattern to the drops.

     Andrew
Linus Walleij Oct. 29, 2023, 10:15 p.m. UTC | #6
Hi Vladimir,

first: THANKS, because you ALWAYS ask the *right* questions,
and I quickly get closer to the real solution!

On Sun, Oct 29, 2023 at 12:04 AM Vladimir Oltean <olteanv@gmail.com> wrote:
> On Fri, Oct 27, 2023 at 10:21:39PM +0200, Linus Walleij wrote:

> > It was reported that the "LuCI" web UI was not working properly
> > with a device using the RTL8366RB switch. Disabling the egress
> > port tagging code made the switch work again, but this is not
> > a good solution as we want to be able to direct traffic to a
> > certain port.
> >
> > It turns out that sometimes, but not always, small packets are
> > dropped by the switch for no reason.
>
> "For no reason" is a technical statement which means "an unspecific/inconclusive
> drop reason in the ethtool -S output on the conduit interface (which also
> shows the hardware counters of the CPU port", or is it just a figure of
> speech? If just a figure of speech, could you please determine which
> counter gets incremented when the switch drops packets?

I ran ethtool -S on the switch port before and after trying to access the web
UI (LuCI) on the router with the chip, i.e. putting http://192.168.1.1 into
the address bar of a browser and hitting enter three times.

Then I ran diff -ur on the two outputs (before and after) and get this
disturbing and symmetric stat:

-     Dot1dTpPortInDiscards: 0
+     Dot1dTpPortInDiscards: 3

(Counters defined in drivers/net/dsa/realtek/rtl8366rb.c)

> What user port is being targeted when the switch drops packets? Any user
> port, or just specific ones?

I tried on lan0, lan1, lan2 and lan3 (DSA ports 0,1,2,3): same result.
On each of the ports, the same Dot1dTpPortInDiscards counter
goes up.

> What protocol headers do those packets that are dropped have?

HTTP

> Is it size
> that they have in common, I wonder (given that you say that small
> packets are not always dropped), or is it something else?

I got it wrong, it's big packets getting dropped, not small ones... :(

Some tcpdump:ing gives at hand that the problem is that the
httpd is sending 1500 byte packages. Anything over 1496 fails
in ping tests.

1496 is suspiciously much 1500 - DSA tag size. However the
MTU of the parent ethernet is bumped nicely to 1504 and the
device MTU is set up to accomodate it as well.

Modifying the patch to just pad out packets >= 1496 bytes
solves the problem in a better way, but maybe that is not the
last thing we try here...

I'll resend the patch with more elaborate commit log and test
description in the commit.

> > Fixes: 0e90dfa7a8d8 ("net: dsa: tag_rtl4_a: Fix egress tags")
>
> Have you actually checked out this sha1sum and confirmed that the packet
> drop can be reproduced there? Ideally you could also go back to a bit
> earlier, to commit 9eb8bc593a5e ("net: dsa: tag_rtl4_a: fix egress tags")
> (this is a different commit from Qingfang with the same description) and
> test on user port 0 only?

Yes it should be an earlier commit indeed. I'll fix!

Yours,
Linus Walleij
David Laight Oct. 29, 2023, 10:18 p.m. UTC | #7
From: Andrew Lunn
> Sent: 28 October 2023 00:04
> 
> On Fri, Oct 27, 2023 at 02:23:13PM -0700, Florian Fainelli wrote:
> > You would want your subject to be:
> >
> > net: dsa: tag_rtl4_a: Bump min packet size
> >
> > On 10/27/23 13:21, Linus Walleij wrote:
> > > It was reported that the "LuCI" web UI was not working properly
> > > with a device using the RTL8366RB switch. Disabling the egress
> > > port tagging code made the switch work again, but this is not
> > > a good solution as we want to be able to direct traffic to a
> > > certain port.
> > >
> > > It turns out that sometimes, but not always, small packets are
> > > dropped by the switch for no reason.
> >
> > And we are positive that the Ethernet MAC is also properly padding frames
> > before having them ingress the switch?
> 
> It might be interesting to run a script which systematically does a
> ping, or similar, for all frame sizes.
> 
> > > If we pad the ethernet frames to a minimum of ETH_FRAME_LEN + FCS
> > > (1518 bytes) everything starts working fine.

Thought - is that just because it slows everything down??

> > That is quite unprecedented, either the switch is very bogus or there is
> > something else we do not fully understand...
> 
> It would also be interesting to know if the frames on the wire have
> the padding removed when needed. Its not going to be good for
> performance if a TCP ACK is 1500bytes in size, rather than the usual
> ~64.

Non IP protocols are very likely to object to unexpected frame padding.
I'm also sure I've seen systems do (the equivalent of) printk for
overlong UDP packets.

If you search the right place you'll find reports of packets
being discarded before one of the VM network implementations
padded ethernet frames to an even byte length.
(I can't remember which, but have some fpga logic that adjusts
the MAC address based on the TCP port number and manages to
require there be no unexpected padding - yes it is broken.)

	David

> 
> Have problems also been noticed with traffic going from user port to
> user port?
> 
> 	Andrew

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Andrew Lunn Oct. 29, 2023, 11 p.m. UTC | #8
> 1496 is suspiciously much 1500 - DSA tag size. However the
> MTU of the parent ethernet is bumped nicely to 1504 and the
> device MTU is set up to accomodate it as well.
> 
> Modifying the patch to just pad out packets >= 1496 bytes
> solves the problem in a better way, but maybe that is not the
> last thing we try here...

Have you tried playing with RTL8366RB_SGCR in rtl8366rb_change_mtu()?

I had an annoying bug in the mv88e6xxx driver where the MTU
configuration register was up to, but not including... So i had to
change a <= to <.

	Andrew
Linus Walleij Oct. 30, 2023, 7:39 a.m. UTC | #9
On Mon, Oct 30, 2023 at 12:00 AM Andrew Lunn <andrew@lunn.ch> wrote:

> > 1496 is suspiciously much 1500 - DSA tag size. However the
> > MTU of the parent ethernet is bumped nicely to 1504 and the
> > device MTU is set up to accomodate it as well.
> >
> > Modifying the patch to just pad out packets >= 1496 bytes
> > solves the problem in a better way, but maybe that is not the
> > last thing we try here...
>
> Have you tried playing with RTL8366RB_SGCR in rtl8366rb_change_mtu()?

Yes I set the MTU to increasingly larger sizes.

Sadly MTU has nothing to do with this, it's really annoying,
I have a v2 patch that is better and working, I'll include some test
text so you see where the problem is.

Yours,
Linus Walleij
Vladimir Oltean Oct. 30, 2023, 12:51 p.m. UTC | #10
On Sun, Oct 29, 2023 at 04:38:04PM +0100, Linus Walleij wrote:
> On Fri, Oct 27, 2023 at 11:23 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
> 
> > > It turns out that sometimes, but not always, small packets are
> > > dropped by the switch for no reason.
> >
> > And we are positive that the Ethernet MAC is also properly padding
> > frames before having them ingress the switch?
> 
> I don't fully follow, this code is the one adding the padding isn't it?
> Then the result is transmitted to the switch from the ethernet
> MAC (drivers/net/ethernet/cortina/gemini.c).
> 
> What am I getting wrong here...

What you are missing is that the existing padding done by rtl4a_tag_xmit()
shouldn't be normally needed except for exceptional cases.

Socket buffers smaller than ETH_ZLEN can be passed to any network
device, and it is expected that either the driver or the hardware pads
up to ETH_ZLEN automatically. Thus, the conduit driver should already
know that it needs to pad packets to ETH_ZLEN.

The exceptional cases are:
- This is a tail tag (not the case here), which by definition needs to
  be located at the end of the skb. If you first put the tag then let
  the conduit interface pad, then the tail tag is no longer at the tail.
  So in that case, DSA pads first in generic code - dsa_user_xmit().
- The switch must handle the case where, after stripping the DSA tag
  from a ETH_ZLEN sized packet coming from the CPU port, it re-pads the
  packet on user port egress. Some switches don't handle that properly,
  and thus, we have isolated __skb_put_padto() calls within certain
  tagging protocols which address just that case.

So, what Florian was asking is whether the conduit interface is not
doing its expected job properly. You clarified that the problem is big
rather than small packets, but we still need an explanation for the
existing __skb_put_padto() call, given that it seems that it was placed
there due to a misunderstanding rather than due to an explicit need for
an exceptional case.
Linus Walleij Oct. 30, 2023, 2:29 p.m. UTC | #11
On Mon, Oct 30, 2023 at 1:51 PM Vladimir Oltean <olteanv@gmail.com> wrote:

> So, what Florian was asking is whether the conduit interface is not
> doing its expected job properly. You clarified that the problem is big
> rather than small packets, but we still need an explanation for the
> existing __skb_put_padto() call, given that it seems that it was placed
> there due to a misunderstanding rather than due to an explicit need for
> an exceptional case.

Hm Deng added that and it was because the same was needed for
some other tagger IIRC, let's see what he says.

I will do some tests to just remove it and see what happens.

Yours,
Linus Walleij
Linus Walleij Oct. 30, 2023, 2:32 p.m. UTC | #12
Hi Deng,

do you have some comments on the below, pertaining to
commit 9eb8bc593a5eed167dac2029abef343854c5ba75
"net: dsa: tag_rtl4_a: fix egress tags"?

I plan to test without the ZLEN padding and see what happens.
IIRC it wasn't working without that, but I may just misremember
the whole thing so let's rehash this.

On Mon, Oct 30, 2023 at 1:51 PM Vladimir Oltean <olteanv@gmail.com> wrote:

> What you are missing is that the existing padding done by rtl4a_tag_xmit()
> shouldn't be normally needed except for exceptional cases.
>
> Socket buffers smaller than ETH_ZLEN can be passed to any network
> device, and it is expected that either the driver or the hardware pads
> up to ETH_ZLEN automatically. Thus, the conduit driver should already
> know that it needs to pad packets to ETH_ZLEN.
>
> The exceptional cases are:
> - This is a tail tag (not the case here), which by definition needs to
>   be located at the end of the skb. If you first put the tag then let
>   the conduit interface pad, then the tail tag is no longer at the tail.
>   So in that case, DSA pads first in generic code - dsa_user_xmit().
> - The switch must handle the case where, after stripping the DSA tag
>   from a ETH_ZLEN sized packet coming from the CPU port, it re-pads the
>   packet on user port egress. Some switches don't handle that properly,
>   and thus, we have isolated __skb_put_padto() calls within certain
>   tagging protocols which address just that case.
>
> So, what Florian was asking is whether the conduit interface is not
> doing its expected job properly. You clarified that the problem is big
> rather than small packets, but we still need an explanation for the
> existing __skb_put_padto() call, given that it seems that it was placed
> there due to a misunderstanding rather than due to an explicit need for
> an exceptional case.

Yours,
Linus Walleij
diff mbox series

Patch

diff --git a/net/dsa/tag_rtl4_a.c b/net/dsa/tag_rtl4_a.c
index c327314b95e3..8b1e81a6377b 100644
--- a/net/dsa/tag_rtl4_a.c
+++ b/net/dsa/tag_rtl4_a.c
@@ -41,8 +41,11 @@  static struct sk_buff *rtl4a_tag_xmit(struct sk_buff *skb,
 	u8 *tag;
 	u16 out;
 
-	/* Pad out to at least 60 bytes */
-	if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false)))
+	/* We need to pad out to at least ETH_FRAME_LEN + FCS bytes. This was
+	 * found by trial-and-error. Sometimes smaller frames work, but sadly
+	 * not always.
+	 */
+	if (unlikely(__skb_put_padto(skb, ETH_FRAME_LEN + ETH_FCS_LEN, false)))
 		return NULL;
 
 	netdev_dbg(dev, "add realtek tag to package to port %d\n",