Message ID | 20231030-fix-rtl8366rb-v2-1-e66e1ef7dbd2@linaro.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v2] net: dsa: tag_rtl4_a: Bump min packet size | expand |
On Mon, Oct 30, 2023 at 10:26:50AM +0100, Linus Walleij wrote: > It was reported that the "LuCI" web UI was not working properly > with a device using the RTL8366RB switch. Disabling the egress > port tagging code made the switch work again, but this is not > a good solution as we want to be able to direct traffic to a > certain port. > > It turns out that packets between 1496 and 1500 bytes are > dropped unless padded out to 1518 bytes. > > If we pad the ethernet frames to a minimum of ETH_FRAME_LEN + FCS > (1518 bytes) everything starts working fine. > > 1496 is the size of a normal data frame minus the internal DSA > tag. The number is intuitive, the explanation evades me. > > As we completely lack datasheet or any insight into how this > switch works, this trial-and-error solution is the best we > can do. FWIW this bug may very well be the reason why Realteks > code drops are not using CPU tagging. The vendor routers using > this switch are all hardwired to disable CPU tagging and all > packets are pushed to all ports on TX. This is also the case > in the old OpenWrt driver, derived from the vendor code drops. > > I have tested smaller sizes, only 1518 or bigger padding works. > > Modifying the MTU on the switch (one setting affecting all > ports) has no effect. > > Before this patch: > > PING 192.168.1.1 (192.168.1.1) 1470(1498) bytes of data. > ^C > --- 192.168.1.1 ping statistics --- > 2 packets transmitted, 0 received, 100% packet loss, time 1048ms > > PING 192.168.1.1 (192.168.1.1) 1472(1500) bytes of data. > ^C > --- 192.168.1.1 ping statistics --- > 12 packets transmitted, 0 received, 100% packet loss, time 11267ms > > After this patch: > > PING 192.168.1.1 (192.168.1.1) 1470(1498) bytes of data. > 1478 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.533 ms > 1478 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.630 ms > > PING 192.168.1.1 (192.168.1.1) 1472(1500) bytes of data. > 1480 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.527 ms > 1480 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.562 ms > > Also LuCI starts working with the 1500 bytes frames it produces > from the HTTP server. > > Fixes: 9eb8bc593a5e ("net: dsa: tag_rtl4_a: fix egress tags") > Signed-off-by: Linus Walleij <linus.walleij@linaro.org> > --- > Changes in v2: > - Pad packages >= 1496 bytes after some further tests > - Use more to-the-point description > - Provide ping results in the commit message > - Link to v1: https://lore.kernel.org/r/20231027-fix-rtl8366rb-v1-1-d565d905535a@linaro.org > --- > net/dsa/tag_rtl4_a.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/net/dsa/tag_rtl4_a.c b/net/dsa/tag_rtl4_a.c > index c327314b95e3..3292bc49b158 100644 > --- a/net/dsa/tag_rtl4_a.c > +++ b/net/dsa/tag_rtl4_a.c > @@ -45,6 +45,16 @@ static struct sk_buff *rtl4a_tag_xmit(struct sk_buff *skb, > if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false))) > return NULL; > > + /* Packets over 1496 bytes get dropped unless they get padded > + * out to 1518 bytes. 1496 is ETH_DATA_LEN - tag which is hardly > + * a coinicidence, and 1518 is ETH_FRAME_LEN + FCS so we define > + * the threshold size and padding like this. > + */ > + if (skb->len >= (ETH_DATA_LEN - RTL4_A_HDR_LEN)) { > + if (unlikely(__skb_put_padto(skb, ETH_FRAME_LEN + ETH_FCS_LEN, false))) > + return NULL; > + } De-obfuscating the macros: if (skb->len >= 1496) __skb_put_padto(skb, 1518, false); (...) skb_push(skb, 4); which means that here, skb->len will be 1522, if it was originally 1496. So the code adds 26 extra octets, and only 4 of those are legitimate (a tag). The rest is absolutely unexplained, which means that until there is a valid explanation for them: pw-bot: cr (sorry, but if it works and we don't know why it works, then at some point it will break and we won't know why it stopped working) In the discussion on the previous thread: https://lore.kernel.org/netdev/CACRpkdbq03ZXcB-TaBp5Udo3M47rb-o+LfkEkC-gA1+=x1Zd-g@mail.gmail.com/ you said that what increments is Dot1dTpPortInDiscards. 802.1Q-2018 says about it: "Count of received valid frames that were discarded (i.e., filtered) by the Forwarding Process." which is odd enough to me, since packets sent by rtl4a_tag_xmit() should *not* be processed by the forwarding layer of the switch, but rather, force-delivered to the specified egress port. Could you please try to revert the effect of commit 339133f6c318 ("net: dsa: tag_rtl4_a: Drop bit 9 from egress frames") by setting that bit in the egress tag again? Who knows, maybe it is the bit which tells the switch to bypass the forwarding process. Or maybe there is a bit in a different position which does this. You could try to fill in all bits in unknown positions, check that there are no regressions with packet sizes < 1496, and then see if that made any changes to packet sizes >= 1496. If it did, try to see which bit made the difference. > + > netdev_dbg(dev, "add realtek tag to package to port %d\n", > dp->index); > skb_push(skb, RTL4_A_HDR_LEN); > > --- > base-commit: d9e164e4199bc465b3540d5fe3ffc8a9a4fc6157 > change-id: 20231027-fix-rtl8366rb-e752bd5901ca > > Best regards, > -- > Linus Walleij <linus.walleij@linaro.org> >
On Mon, Oct 30, 2023 at 3:16 PM Vladimir Oltean <olteanv@gmail.com> wrote: > Could you please try to revert the effect of commit 339133f6c318 ("net: > dsa: tag_rtl4_a: Drop bit 9 from egress frames") by setting that bit in > the egress tag again? Who knows, maybe it is the bit which tells the > switch to bypass the forwarding process. I have already tried that, it was one of the first things I tried, just looking over the git log and looking for usual suspects. Sadly it has no effect whatsoever, the problem persists :( > Or maybe there is a bit in a > different position which does this. You could try to fill in all bits in > unknown positions, check that there are no regressions with packet sizes > < 1496, and then see if that made any changes to packet sizes >= 1496. > If it did, try to see which bit made the difference. Hehe now we're talking :D I did something similar before, I think just switching a different bit every 10 packets or so and running a persistent ping until it succeeds or not. I'll see what I can come up with. Yours, Linus Walleij
On Mon, Oct 30, 2023 at 03:37:24PM +0100, Linus Walleij wrote: > On Mon, Oct 30, 2023 at 3:16 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > > Could you please try to revert the effect of commit 339133f6c318 ("net: > > dsa: tag_rtl4_a: Drop bit 9 from egress frames") by setting that bit in > > the egress tag again? Who knows, maybe it is the bit which tells the > > switch to bypass the forwarding process. > > I have already tried that, it was one of the first things I tried, > just looking over the git log and looking for usual suspects. > > Sadly it has no effect whatsoever, the problem persists :( > > > Or maybe there is a bit in a > > different position which does this. You could try to fill in all bits in > > unknown positions, check that there are no regressions with packet sizes > > < 1496, and then see if that made any changes to packet sizes >= 1496. > > If it did, try to see which bit made the difference. > > Hehe now we're talking :D > > I did something similar before, I think just switching a different bit > every 10 packets or so and running a persistent ping until it succeeds > or not. > > I'll see what I can come up with. > > Yours, > Linus Walleij And the drop reason in ethtool -S also stays unchanged despite all the extra bits set in the tag? It still behaves as if the packet goes through the forwarding path?
On Mon, Oct 30, 2023 at 05:23:25PM +0200, Vladimir Oltean wrote: > On Mon, Oct 30, 2023 at 03:37:24PM +0100, Linus Walleij wrote: > > On Mon, Oct 30, 2023 at 3:16 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > > > > Could you please try to revert the effect of commit 339133f6c318 ("net: > > > dsa: tag_rtl4_a: Drop bit 9 from egress frames") by setting that bit in > > > the egress tag again? Who knows, maybe it is the bit which tells the > > > switch to bypass the forwarding process. > > > > I have already tried that, it was one of the first things I tried, > > just looking over the git log and looking for usual suspects. > > > > Sadly it has no effect whatsoever, the problem persists :( > > > > > Or maybe there is a bit in a > > > different position which does this. You could try to fill in all bits in > > > unknown positions, check that there are no regressions with packet sizes > > > < 1496, and then see if that made any changes to packet sizes >= 1496. > > > If it did, try to see which bit made the difference. > > > > Hehe now we're talking :D > > > > I did something similar before, I think just switching a different bit > > every 10 packets or so and running a persistent ping until it succeeds > > or not. > > > > I'll see what I can come up with. > > > > Yours, > > Linus Walleij > > And the drop reason in ethtool -S also stays unchanged despite all the > extra bits set in the tag? It still behaves as if the packet goes > through the forwarding path? Could you please place these skb_dump() calls before and after the magic __skb_put_padto() call, for us to see if anything unexpected changes? Maybe the socket buffers have some unusual geometry which the conduit interface doesn't like, for some reason. The number of skb dumps that you provide back should be even, and ideally the first one should be the unaltered packet, to avoid confusion :) diff --git a/net/dsa/tag_rtl4_a.c b/net/dsa/tag_rtl4_a.c index 25238093686c..2ca189b5125e 100644 --- a/net/dsa/tag_rtl4_a.c +++ b/net/dsa/tag_rtl4_a.c @@ -41,18 +41,22 @@ static struct sk_buff *rtl4a_tag_xmit(struct sk_buff *skb, u8 *tag; u16 out; - /* Pad out to at least 60 bytes */ - if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false))) - return NULL; - /* Packets over 1496 bytes get dropped unless they get padded * out to 1518 bytes. 1496 is ETH_DATA_LEN - tag which is hardly * a coinicidence, and 1518 is ETH_FRAME_LEN + FCS so we define * the threshold size and padding like this. */ if (skb->len >= (ETH_DATA_LEN - RTL4_A_HDR_LEN)) { + skb_dump(KERN_ERR, skb, true); + if (unlikely(__skb_put_padto(skb, ETH_FRAME_LEN + ETH_FCS_LEN, false))) return NULL; + + skb_dump(KERN_ERR, skb, true); + } else { + /* Pad out to at least 60 bytes */ + if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false))) + return NULL; } netdev_dbg(dev, "add realtek tag to package to port %d\n",
On Mon, Oct 30, 2023 at 3:16 PM Vladimir Oltean <olteanv@gmail.com> wrote: > which means that here, skb->len will be 1522, if it was originally 1496. > So the code adds 26 extra octets, and only 4 of those are legitimate (a tag). Yeah I know :/ > The rest is absolutely unexplained, which means that until there is a > valid explanation for them: > > pw-bot: cr > > (sorry, but if it works and we don't know why it works, then at some > point it will break and we won't know why it stopped working) Yeah it broke now and we don't know why... > you said that what increments is Dot1dTpPortInDiscards. 802.1Q-2018 says > about it: "Count of received valid frames that were discarded (i.e., > filtered) by the Forwarding Process." which is odd enough to me, since > packets sent by rtl4a_tag_xmit() should *not* be processed by the forwarding > layer of the switch, but rather, force-delivered to the specified egress > port. No this was a coincidence, we can rule this out. There are always a few (2-3) Dot1dTpPortInDiscards on the switch port when it is connected, sorry for getting this wrong :/ What happens is way more disturbing: packets are dropped *silently* if not padded. I added the following patch: @@ -37,6 +37,8 @@ static struct sk_buff *rtl4a_tag_xmit(struct sk_buff *skb, struct net_device *dev) { struct dsa_port *dp = dsa_slave_to_port(dev); + static u16 mask = BIT(6); + static int cnt = 0; __be16 *p; u8 *tag; u16 out; @@ -60,6 +62,19 @@ static struct sk_buff *rtl4a_tag_xmit(struct sk_buff *skb, /* The lower bits indicate the port number */ out |= BIT(dp->index); + if (skb->len >= (ETH_DATA_LEN - RTL4_A_HDR_LEN)) { + /* Test bits... */ + out |= mask; + netdev_info(dev, "add mask %04x to big package\n", mask); + cnt ++; + if (cnt == 10) { + cnt = 0; + mask <<= 1; + if (mask == BIT(15)) + mask = BIT(6); + } + } + p = (__be16 *)(tag + 2); *p = htons(out); This loops over all the bits not used by the port mask and test them one by one to see if any of them help. Then ran a few rounds of ping -s 1472 and ping -s 1470. There are console prints: realtek-smi switch lan0: add mask 0040 to big package realtek-smi switch lan0: add mask 0040 to big package realtek-smi switch lan0: add mask 0040 to big package (...) Then bits 6,7,8,9,10,11,12,13,14 and 15 are tested in succession. No error counters increase in ethtool -S lan0. I can see the big packets leave the eth0 interface (from ethtool -S eth0) p05_EtherStatsPkts1024to1518Octe: 370 But they do not appear in the targeted switch port stats: EtherStatsPkts1024to1518Octets: 22 (these 22 are some unrelated -s 1400 packets I sent to test) Yours, Linus Walleij
On Mon, Oct 30, 2023 at 10:50:31PM +0100, Linus Walleij wrote: > > you said that what increments is Dot1dTpPortInDiscards > No this was a coincidence, we can rule this out. I see commit 86dd9868b878 ("net: dsa: tag_rtl4_a: Support also egress tags") also mentions: "Qingfang came up with the solution: we need to pad the ethernet frame to 60 bytes using eth_skb_pad(), then the switch will happily accept frames with custom tags.". So the __skb_put_padto() was something very empirical in the first place. Since it's all problematic, would you mind removing the __skb_put_padto() altogether from rtl4a_tag_xmit(), and let me know what is the output for the following sweep through packet sizes? I truly wonder if it's just for small and large packets that we see packet drops, or if it's something repetitive throughout the range as well. for size in $(seq 0 1476); do if ping 10.0.0.56 -s $size -W 1 -c 1 -q >/dev/null; then echo "$((size + 42)): OK"; else echo "$((size + 42)): NOK"; fi; done
On Mon, Oct 30, 2023 at 11:20 PM Vladimir Oltean <olteanv@gmail.com> wrote: > I see commit 86dd9868b878 ("net: dsa: tag_rtl4_a: Support also egress tags") > also mentions: "Qingfang came up with the solution: we need to pad the > ethernet frame to 60 bytes using eth_skb_pad(), then the switch will > happily accept frames with custom tags.". So the __skb_put_padto() was > something very empirical in the first place. > > Since it's all problematic, would you mind removing the __skb_put_padto() > altogether from rtl4a_tag_xmit(), and let me know what is the output for > the following sweep through packet sizes? I truly wonder if it's just > for small and large packets that we see packet drops, or if it's something > repetitive throughout the range as well. > > for size in $(seq 0 1476); do if ping 10.0.0.56 -s $size -W 1 -c 1 -q >/dev/null; then echo "$((size + 42)): OK"; else echo "$((size + 42)): NOK"; fi; done The weird thing is that if I remove the __skb_put_padto() call, ping doesn't work at all. Somehow the packets are corrupted, because they sure get out of the switch and I can see them arriving with tcpdump on the host. root@OpenWrt:/# for size in $(seq 0 1476); do if ping 192.168.1.137 -s $size -W 1 -c 1 -q >/dev/null; then echo "$((size + 42)): OK"; else echo "$((size + 42)): NOK"; fi; done 42: NOK 43: NOK 44: NOK 45: NOK 46: NOK 47: NOK 48: NOK 49: NOK 50: NOK 51: NOK (...) 1509: NOK 1510: NOK 1511: NOK 1512: NOK 1513: NOK 1514: NOK 1515: NOK 1516: NOK 1517: NOK 1518: NOK This of course make no sense, since the padding function should do nothing when the packet is bigger than 60 bytes. So what we are seeing is some kind of side effect from the usage of __skb_put_padto() I suppose? But I have no idea what that is, I looked at the function and what it calls down to __skb_pad(). I'm testing skb_linearize(), which seems to be called on this path... TCPdump on the host looks like this: # tcpdump -i enp7s0 dropped privs to tcpdump tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on enp7s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes 23:28:55.184019 IP _gateway > fedora: ICMP echo request, id 2461, seq 0, length 27 23:28:56.205294 IP _gateway > fedora: ICMP echo request, id 2462, seq 0, length 28 23:28:57.226495 IP _gateway > fedora: ICMP echo request, id 2463, seq 0, length 29 23:28:58.248013 IP _gateway > fedora: ICMP echo request, id 2464, seq 0, length 30 23:28:59.269157 IP _gateway > fedora: ICMP echo request, id 2465, seq 0, length 31 23:29:00.290443 IP _gateway > fedora: ICMP echo request, id 2466, seq 0, length 32 23:29:01.698700 IP _gateway > fedora: ICMP echo request, id 2467, seq 0, length 33 23:29:02.332131 IP _gateway > fedora: ICMP echo request, id 2468, seq 0, length 34 23:29:03.352442 IP _gateway > fedora: ICMP echo request, id 2469, seq 0, length 35 (...) 23:53:33.834706 IP _gateway > fedora: ICMP echo request, id 4000, seq 0, length 1475 23:53:34.854946 IP _gateway > fedora: ICMP echo request, id 4001, seq 0, length 1476 23:53:36.258777 IP truncated-ip - 1 bytes missing! _gateway > fedora: ICMP echo request, id 4002, seq 0, length 1477 23:53:36.896654 IP truncated-ip - 2 bytes missing! _gateway > fedora: ICMP echo request, id 4003, seq 0, length 1478 23:53:37.918022 IP truncated-ip - 3 bytes missing! _gateway > fedora: ICMP echo request, id 4004, seq 0, length 1479 23:53:38.938355 IP truncated-ip - 4 bytes missing! _gateway > fedora: ICMP echo request, id 4005, seq 0, length 1480 23:53:39.958451 IP truncated-ip - 4 bytes missing! _gateway > fedora: ICMP echo request, id 4006, seq 0, length 1480 23:53:40.978598 IP truncated-ip - 4 bytes missing! _gateway > fedora: ICMP echo request, id 4007, seq 0, length 1480 23:53:41.998991 IP truncated-ip - 4 bytes missing! _gateway > fedora: ICMP echo request, id 4008, seq 0, length 1480 23:53:43.020309 IP truncated-ip - 4 bytes missing! _gateway > fedora: ICMP echo request, id 4010, seq 0, length 1480 Here you can incidentally also see what happens if we don't pad the big packets: the packet gets truncated. Yours, Linus Walleij
On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: > This of course make no sense, since the padding function should do nothing > when the packet is bigger than 60 bytes. Indeed, this of course makes no sense. Ping doesn't work, or ARP doesn't work? Could you add a static ARP entry for the 192.168.1.137 IP address?
On Tue, Oct 31, 2023 at 12:09 AM Vladimir Oltean <olteanv@gmail.com> wrote: > On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: > > This of course make no sense, since the padding function should do nothing > > when the packet is bigger than 60 bytes. > > Indeed, this of course makes no sense. Ping doesn't work, or ARP doesn't > work? Could you add a static ARP entry for the 192.168.1.137 IP address? ARP appears to work, and also DHCP then (I reconnect and restart the interface before each test), so it's really weird. I'm trying your suggestion to skb_dump() before/after. Yours, Linus Walleij
On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: > Here you can incidentally also see what happens if we don't pad the big packets: > the packet gets truncated. Are we sure we are debugging a switch problem? On how many platforms with the RTL8366RB can the issue be seen? Is the conduit interface the same on all these platforms, or is it different and that makes no difference?
On Tue, Oct 31, 2023 at 01:09:06AM +0200, Vladimir Oltean wrote: > On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: > > This of course make no sense, since the padding function should do nothing > > when the packet is bigger than 60 bytes. > > Indeed, this of course makes no sense. Ping doesn't work, or ARP doesn't > work? Could you add a static ARP entry for the 192.168.1.137 IP address? Probably the ARP, since they are short packets and probably need the padding. Andrew
On 10/30/2023 5:37 PM, Andrew Lunn wrote: > On Tue, Oct 31, 2023 at 01:09:06AM +0200, Vladimir Oltean wrote: >> On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: >>> This of course make no sense, since the padding function should do nothing >>> when the packet is bigger than 60 bytes. >> >> Indeed, this of course makes no sense. Ping doesn't work, or ARP doesn't >> work? Could you add a static ARP entry for the 192.168.1.137 IP address? > > Probably the ARP, since they are short packets and probably need the > padding. Does this also mean that there is a general problem with unicast packets, and that broadcast packets happen to work by accident? Could you check whether a broadcast ping (ping -b) size sweep is immune to what you are reporting?
On Tue, Oct 31, 2023 at 1:37 AM Andrew Lunn <andrew@lunn.ch> wrote: > On Tue, Oct 31, 2023 at 01:09:06AM +0200, Vladimir Oltean wrote: > > On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: > > > This of course make no sense, since the padding function should do nothing > > > when the packet is bigger than 60 bytes. > > > > Indeed, this of course makes no sense. Ping doesn't work, or ARP doesn't > > work? Could you add a static ARP entry for the 192.168.1.137 IP address? > > Probably the ARP, since they are short packets and probably need the > padding. It seems correct: the reason ping stops working is not that the ping isn't reaching the host, I can see the pin request on tcpdumps. The reason is that the host has no idea where to send the reply. Because it's not getting any ARP replies. And that is probably because the ARP replies are short and needs padding to ETH_ZLEN. I notice the code is probably borrowed from tag_brcm.c which does exactly the same thing (ETH_ZLEN + tag). It comes with this explanation: /* The Ethernet switch we are interfaced with needs packets to be at * least 64 bytes (including FCS) otherwise they will be discarded when * they enter the switch port logic. When Broadcom tags are enabled, we * need to make sure that packets are at least 68 bytes * (including FCS and tag) because the length verification is done after * the Broadcom tag is stripped off the ingress packet. * * Let dsa_slave_xmit() free the SKB */ The switch fabric is dropping smaller packets when CPU tags (DSA tags) are enabled. So is the padding to ETH_ZLEN OK or should is happen elsewhere? Yours, Linus Walleij
On Tue, Oct 31, 2023 at 12:33 AM Vladimir Oltean <olteanv@gmail.com> wrote: > On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: > > Here you can incidentally also see what happens if we don't pad the big packets: > > the packet gets truncated. > > Are we sure we are debugging a switch problem? On how many platforms > with the RTL8366RB can the issue be seen? Is the conduit interface the > same on all these platforms, or is it different and that makes no > difference? I don't have any other RTL8366RB systems than the D-Link DIR-685. I however have several systems with the same backing ethernet controller connected directly to a PHY and they all work fine. Yours, Linus Walleij
On 10/31/23 07:06, Linus Walleij wrote: > On Tue, Oct 31, 2023 at 1:37 AM Andrew Lunn <andrew@lunn.ch> wrote: >> On Tue, Oct 31, 2023 at 01:09:06AM +0200, Vladimir Oltean wrote: >>> On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: >>>> This of course make no sense, since the padding function should do nothing >>>> when the packet is bigger than 60 bytes. >>> >>> Indeed, this of course makes no sense. Ping doesn't work, or ARP doesn't >>> work? Could you add a static ARP entry for the 192.168.1.137 IP address? >> >> Probably the ARP, since they are short packets and probably need the >> padding. > > It seems correct: the reason ping stops working is not that the ping > isn't reaching the host, I can see the pin request on tcpdumps. > > The reason is that the host has no idea where to send the reply. > Because it's not getting any ARP replies. > > And that is probably because the ARP replies are short and needs > padding to ETH_ZLEN. > > I notice the code is probably borrowed from tag_brcm.c which does > exactly the same thing (ETH_ZLEN + tag). It comes with this > explanation: > > /* The Ethernet switch we are interfaced with needs packets to > be at > * least 64 bytes (including FCS) otherwise they will be > discarded when > * they enter the switch port logic. When Broadcom tags are > enabled, we > * need to make sure that packets are at least 68 bytes > * (including FCS and tag) because the length verification is > done after > * the Broadcom tag is stripped off the ingress packet. > * > * Let dsa_slave_xmit() free the SKB > */ > > The switch fabric is dropping smaller packets when CPU tags > (DSA tags) are enabled. > > So is the padding to ETH_ZLEN OK or should is happen elsewhere? This is OK IMHO because you may not always have knowledge of which Ethernet MAC the switch is interfaced with, and with the tagger code, you have a central location to address them all. The padding should ideally be done by the Ethernet MAC, even better when the HW can do it on its own (as it should), because whether there is a switch or not, no link partner should accept runt packets (some might but this is not standard).
On Tue, Oct 31, 2023 at 03:16:50PM +0100, Linus Walleij wrote: > On Tue, Oct 31, 2023 at 12:33 AM Vladimir Oltean <olteanv@gmail.com> wrote: > > On Mon, Oct 30, 2023 at 11:57:33PM +0100, Linus Walleij wrote: > > > Here you can incidentally also see what happens if we don't pad the big packets: > > > the packet gets truncated. > > > > Are we sure we are debugging a switch problem? On how many platforms > > with the RTL8366RB can the issue be seen? Is the conduit interface the > > same on all these platforms, or is it different and that makes no > > difference? > > I don't have any other RTL8366RB systems than the D-Link DIR-685. > > I however have several systems with the same backing ethernet controller > connected directly to a PHY and they all work fine. > > Yours, > Linus Walleij Ok, so we don't have a confirmation of breakage with other conduit interface than the Gemini driver, either. So a problem there is still not off the table. So on the gemini-dlink-dir-685.dts platform, you can also use &gmac1 as a plain Ethernet port, right? If possible, could you set up dsa_loop (enable CONFIG_NET_DSA_LOOP, replace "eth0" in dsa_loop_pdata with the netdev name of &gmac1, replace DSA_TAG_PROTO_NONE in dsa_loop_get_protocol() with your tagging protocol) and put a tcpdump on the remote end of the gmac1 port, to see if the issue isn't, in fact, somewhere else, maybe gmac_start_xmit()? With dsa_loop, you turn &gmac1 into a conduit interface of sorts, except for the fact that there is no switch to process the DSA-tagged packets, you see those packets on the remote end, and you can investigate there whether it's the switch who's corrupting/truncating, or if they're somehow sent corrupted/truncated by Gemini on the wire (which would not be visible in a local tcpdump).
On Tue, Oct 31, 2023 at 5:34 PM Vladimir Oltean <olteanv@gmail.com> wrote: > Ok, so we don't have a confirmation of breakage with other conduit > interface than the Gemini driver, either. So a problem there is still > not off the table. True! > So on the gemini-dlink-dir-685.dts platform, you can also use &gmac1 as > a plain Ethernet port, right? As a port it exist on the SoC yes but it is not connected physically to anything. &gmac0 is connected to the switch, and the switch has all the PHYs. (I don't know if I misunderstand the question...) > If possible, could you set up dsa_loop (enable CONFIG_NET_DSA_LOOP, replace > "eth0" in dsa_loop_pdata with the netdev name of &gmac1, replace DSA_TAG_PROTO_NONE > in dsa_loop_get_protocol() with your tagging protocol) and put a tcpdump > on the remote end of the gmac1 port, to see if the issue isn't, in fact, > somewhere else, maybe gmac_start_xmit()? If you by remote end mean the end of a physical cable there is no way I can do that, as I have no PHY on gmac1. But I have other Gemini platforms, so I will try to do it on one of them! Let's see if I can do this thing.... Yours, Linus Walleij
> I don't have any other RTL8366RB systems than the D-Link DIR-685. > > I however have several systems with the same backing ethernet controller > connected directly to a PHY and they all work fine. Hi Linus, I ported TL-WR1043nd to DSA using RTL8366RB on OpenWrt main. Do you need some help testing the switch? I just need to test ping with different sizes? Regards, Luiz
On Tue, Oct 31, 2023 at 8:18 PM Luiz Angelo Daros de Luca <luizluca@gmail.com> wrote: > > I don't have any other RTL8366RB systems than the D-Link DIR-685. > > > > I however have several systems with the same backing ethernet controller > > connected directly to a PHY and they all work fine. > > Hi Linus, > > I ported TL-WR1043nd to DSA using RTL8366RB on OpenWrt main. Do you > need some help testing the switch? Yes! > I just need to test ping with different sizes? Yes try to ping the host from the router: ping -s 1472 192.168.1.1 or so to send a 1500 byte ping packet, which will be padded up to a 1518 byte ethernet frame and 1522 bytes from the conduit interface. Then if it doesn't work, see if this patch solves the issue! Yours, Linus Walleij
Hi Vladimir, I got around to testing this too: On Mon, Oct 30, 2023 at 4:31 PM Vladimir Oltean <olteanv@gmail.com> wrote: > Could you please place these skb_dump() calls before and after the magic > __skb_put_padto() call, for us to see if anything unexpected changes? > Maybe the socket buffers have some unusual geometry which the conduit > interface doesn't like, for some reason. > > The number of skb dumps that you provide back should be even, and > ideally the first one should be the unaltered packet, to avoid confusion :) I did a variant to just get one SKB dump and not tons of them; @@ -37,22 +37,35 @@ static struct sk_buff *rtl4a_tag_xmit(struct sk_buff *skb, struct net_device *dev) { struct dsa_port *dp = dsa_slave_to_port(dev); + static int cnt = 0; __be16 *p; u8 *tag; u16 out; - /* Pad out to at least 60 bytes */ - if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false))) - return NULL; - /* Packets over 1496 bytes get dropped unless they get padded * out to 1518 bytes. 1496 is ETH_DATA_LEN - tag which is hardly * a coinicidence, and 1518 is ETH_FRAME_LEN + FCS so we define * the threshold size and padding like this. */ if (skb->len >= (ETH_DATA_LEN - RTL4_A_HDR_LEN)) { + cnt++; + + if (cnt == 1) { + pr_info("SKB before padding:\n"); + skb_dump(KERN_INFO, skb, true); + } + if (unlikely(__skb_put_padto(skb, ETH_FRAME_LEN + ETH_FCS_LEN, false))) return NULL; + + if (cnt == 1) { + pr_info("SKB after padding:\n"); + skb_dump(KERN_INFO, skb, true); + } + } else { + /* Pad out to at least 60 bytes */ + if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false))) + return NULL; } # ping -s 1472 192.168.1.137 The result: SKB before padding: 37 (192.168.1.13skb len=1514 headroom=18 headlen=1514 tailroom=260 mac=(18,14) net=(32,20) trans=52 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0xd4ef2b1 ip_summed=0 complete_sw=0 valid=0 level=0) hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0 7): 1472 data bydev name=lan0 feat=0x0002000000005020 tes sk family=2 type=3 proto=1 skb headroom: 00000000: 00 02 00 01 00 00 00 00 00 00 03 78 02 00 bc ae skb headroom: 00000010: 00 00 skb linear: 00000000: bc ae c5 6b a8 3d c2 2f 0b dc cc b4 08 00 45 00 skb linear: 00000010: 05 dc 3b de 40 00 40 01 75 68 c0 a8 01 01 c0 a8 skb linear: 00000020: 01 89 08 00 16 d2 09 54 00 00 8a cc 4d 0d 00 00 skb linear: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000260: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000270: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000420: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000540: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000550: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005e0: 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000100: 00 00 00 00 SKB after padding: skb len=1518 headroom=18 headlen=1518 tailroom=256 mac=(18,14) net=(32,20) trans=52 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0xd4ef2b1 ip_summed=0 complete_sw=0 valid=0 level=0) hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0 dev name=lan0 feat=0x0002000000005020 sk family=2 type=3 proto=1 skb headroom: 00000000: 00 02 00 01 00 00 00 00 00 00 03 78 02 00 bc ae skb headroom: 00000010: 00 00 skb linear: 00000000: bc ae c5 6b a8 3d c2 2f 0b dc cc b4 08 00 45 00 skb linear: 00000010: 05 dc 3b de 40 00 40 01 75 68 c0 a8 01 01 c0 a8 skb linear: 00000020: 01 89 08 00 16 d2 09 54 00 00 8a cc 4d 0d 00 00 skb linear: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000260: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000270: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000002f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000003f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000420: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000004f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000540: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000550: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 00000590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb linear: 000005e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 skb tailroom: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 As expected the linear SKB is 4 bytes longer in this case. Yours, Linus Walleij
On Mon, Oct 30, 2023 at 3:16 PM Vladimir Oltean <olteanv@gmail.com> wrote: > if (skb->len >= 1496) > __skb_put_padto(skb, 1518, false); > > (...) > > skb_push(skb, 4); > > which means that here, skb->len will be 1522, if it was originally 1496. > So the code adds 26 extra octets, and only 4 of those are legitimate (a tag). > The rest is absolutely unexplained, which means that until there is a > valid explanation for them: Indeed only 4 of them are needed, I tested to add 4 bytes on the tail for > 1496 and it works. I'll send a new version. However that its "tag" is bogus since the extra tail isn't needed before the paket becomes 1496 bytes, i.e. 1500 bytes including the tag. It has a logic to it but ... yeah. All I can think of is bad VHDL in the switch. Yours, Linus Walleij
Em ter., 31 de out. de 2023 às 16:27, Linus Walleij <linus.walleij@linaro.org> escreveu: > > On Tue, Oct 31, 2023 at 8:18 PM Luiz Angelo Daros de Luca > <luizluca@gmail.com> wrote: > > > > I don't have any other RTL8366RB systems than the D-Link DIR-685. > > > > > > I however have several systems with the same backing ethernet controller > > > connected directly to a PHY and they all work fine. > > > > Hi Linus, > > > > I ported TL-WR1043nd to DSA using RTL8366RB on OpenWrt main. Do you > > need some help testing the switch? > > Yes! > > > I just need to test ping with different sizes? > > Yes try to ping the host from the router: > > ping -s 1472 192.168.1.1 or so to send a 1500 byte ping packet, > which will be padded up to a 1518 byte ethernet frame and > 1522 bytes from the conduit interface. > > Then if it doesn't work, see if this patch solves the issue! Hi Linus, Sorry but I noticed no issues: From the router: No. Time Source Destination Protocol Length Info 1 0.000000000 192.168.1.1 192.168.1.2 ICMP 1514 Echo (ping) request id=0x0789, seq=23/5888, ttl=64 (reply in 2) 2 0.000040094 192.168.1.2 192.168.1.1 ICMP 1514 Echo (ping) reply id=0x0789, seq=23/5888, ttl=64 (request in 1) 5 1.000361559 192.168.1.1 192.168.1.2 ICMP 1514 Echo (ping) request id=0x0789, seq=24/6144, ttl=64 (reply in 6) 6 1.000439668 192.168.1.2 192.168.1.1 ICMP 1514 Echo (ping) reply id=0x0789, seq=24/6144, ttl=64 (request in 5) From the host: No. Time Source Destination Protocol Length Info 1 0.000000000 192.168.1.2 192.168.1.1 ICMP 1514 Echo (ping) request id=0x0002, seq=8/2048, ttl=64 (reply in 2) 2 0.000391800 192.168.1.1 192.168.1.2 ICMP 1514 Echo (ping) reply id=0x0002, seq=8/2048, ttl=64 (request in 1) 3 1.024825212 192.168.1.2 192.168.1.1 ICMP 1514 Echo (ping) request id=0x0002, seq=9/2304, ttl=64 (reply in 4) 4 1.026865170 192.168.1.1 192.168.1.2 ICMP 1514 Echo (ping) reply id=0x0002, seq=9/2304, ttl=64 (request in 3) If I go over that limit, it fragments the packet as expected. My device is using https://github.com/luizluca/openwrt/tree/ath79_dsa_prep%2Bdevices . In summary, kernel 6.1 with openwrt generic patches and the reset-controller patch I sent net-next recently. [ 3.888540] realtek-smi switch: found an RTL8366RB switch [ 3.952366] realtek-smi switch: RTL5937 ver 3 chip found [ 3.967086] realtek-smi switch: set MAC: 42:E4:F5:XX:XX:XX [ 3.976779] realtek-smi switch: missing child interrupt-controller node [ 3.983455] realtek-smi switch: no interrupt support [ 4.158891] realtek-smi switch: no LED for port 5 [ 4.164130] realtek-smi switch: configuring for fixed/rgmii link mode [ 4.171178] realtek-smi switch wan (uninitialized): PHY [SMI-0:00] driver [RTL8366RB Gigabit Ethernet] (irq=POLL) [ 4.183849] realtek-smi switch lan1 (uninitialized): PHY [SMI-0:01] driver [RTL8366RB Gigabit Ethernet] (irq=POLL) [ 4.196439] realtek-smi switch lan2 (uninitialized): PHY [SMI-0:02] driver [RTL8366RB Gigabit Ethernet] (irq=POLL) [ 4.209258] realtek-smi switch lan3 (uninitialized): PHY [SMI-0:03] driver [RTL8366RB Gigabit Ethernet] (irq=POLL) [ 4.221815] realtek-smi switch lan4 (uninitialized): PHY [SMI-0:04] driver [RTL8366RB Gigabit Ethernet] (irq=POLL) [ 4.243071] realtek-smi switch: Link is Up - 1Gbps/Full - flow control off [ 9.707171] realtek-smi switch lan1: configuring for phy/gmii link mode [ 9.727707] realtek-smi switch lan1: Link is Up - 1Gbps/Full - flow control rx/tx [ 12.289349] realtek-smi switch lan1: Link is Down [ 55.761797] realtek-smi switch lan1: configuring for phy/gmii link mode [ 57.460421] realtek-smi switch lan2: configuring for phy/gmii link mode [ 57.505039] realtek-smi switch lan3: configuring for phy/gmii link mode [ 57.823528] realtek-smi switch lan4: configuring for phy/gmii link mode [ 58.000712] realtek-smi switch lan1: Link is Up - 1Gbps/Full - flow control rx/tx [ 58.181047] realtek-smi switch wan: configuring for phy/gmii link mode Maybe the ag71xx driver is doing something differently. Let me know if you need to test anything else. I didn't test the device with your patch applied. Regards, Luiz
On Wed, Nov 1, 2023 at 1:35 PM Luiz Angelo Daros de Luca <luizluca@gmail.com> wrote: > Sorry but I noticed no issues: Don't be sorry about that, it's good news because now I know where to look, i.e. in the ethernet controller. > My device is using > https://github.com/luizluca/openwrt/tree/ath79_dsa_prep%2Bdevices . In > summary, kernel 6.1 with openwrt generic patches and the > reset-controller patch I sent net-next recently. Looking good to me. > [ 3.888540] realtek-smi switch: found an RTL8366RB switch > [ 3.952366] realtek-smi switch: RTL5937 ver 3 chip found Same version as mine too. > [ 3.967086] realtek-smi switch: set MAC: 42:E4:F5:XX:XX:XX Unrelated: I have seen other DSA switches "inherit" the MAC of the conduit interface (BRCM). I wonder if we could do that too instead of this random MAC number. Usually the conduit interface has a properly configured MAC. > [ 3.976779] realtek-smi switch: missing child interrupt-controller node > [ 3.983455] realtek-smi switch: no interrupt support > [ 4.158891] realtek-smi switch: no LED for port 5 Are the LEDs working? My device has no LEDs so I have never tested it, despite I coded it. (Also these days we can actually support each LED individually configured from device tree using the LED API, but that would be quite a bit of code, so only some fun for the aspiring developer...) > Maybe the ag71xx driver is doing something differently. I'll look at it! Thanks a lot Luiz, Linus Walleij
On Wed, Nov 01, 2023 at 09:26:50PM +0100, Linus Walleij wrote: > > [ 3.967086] realtek-smi switch: set MAC: 42:E4:F5:XX:XX:XX > > Unrelated: I have seen other DSA switches "inherit" the MAC of the > conduit interface (BRCM). I wonder if we could do that too instead > of this random MAC number. Usually the conduit interface has a > properly configured MAC. We need to know what is the MAC address from the RTL8366RB_SMAR registers used for. What you know about is that DSA user interfaces have their own MAC address for packet termination (send/receive to/from the CPU). MAC addresses for switch ports are an abstract concept, of course (switches normally just forward packets, nothing is addressed *to* them), and so, these addresses are not programmed per se to hardware unless the prerequisites of dsa_switch_supports_uc_filtering() are implemented. If they are, the MAC addresses of user ports are programmed as host FDB entries (FDB entries towards the CPU port). The rule for establishing the MAC address of each user port is as follows: if of_get_mac_address() finds something valid for that port's OF node - provided by the bootloader - ("address", "mac-address", "local-mac-address", a nvmem provider etc), then that value is used. Otherwise, the MAC address is inherited from the conduit interface. Some switches also have a global MAC address register (used for all ports) of their own, but it is switch-specific what this does. We look at the functionality it offers when deciding what to program it to, since it's of course not possible to sync a single hardware register to the value of the MAC addresses of multiple user ports. For KSZ switches - see ksz_switch_macaddr_get() - the global MAC address register is used for HSR self-address filtering and for Wake on LAN. We sync the value of this hardware register with the MAC address of the first user port on which these optional features are enabled. Then, we allow these optional features only on the other user ports which have the same MAC address as the original one which enabled that feature. On KSZ, the same switch MAC address is also used as MAC SA in generated pause frames, but to our knowledge, that MAC address could be any address (even 00:00:00:00:00:00), so we don't really care too much about that and we let it fall to whatever value it may.
Hi Luiz, On Wed, Nov 01, 2023 at 09:35:30AM -0300, Luiz Angelo Daros de Luca wrote: > Hi Linus, > > Sorry but I noticed no issues: > > From the router: > > No. Time Source Destination Protocol Length Info > 1 0.000000000 192.168.1.1 192.168.1.2 ICMP 1514 Echo (ping) request id=0x0789, seq=23/5888, ttl=64 (reply in 2) > 2 0.000040094 192.168.1.2 192.168.1.1 ICMP 1514 Echo (ping) reply id=0x0789, seq=23/5888, ttl=64 (request in 1) > > From the host: > > No. Time Source Destination Protocol Length Info > 1 0.000000000 192.168.1.2 192.168.1.1 ICMP 1514 Echo (ping) request id=0x0002, seq=8/2048, ttl=64 (reply in 2) > 2 0.000391800 192.168.1.1 192.168.1.2 ICMP 1514 Echo (ping) reply id=0x0002, seq=8/2048, ttl=64 (request in 1) > > If I go over that limit, it fragments the packet as expected. Could you run the shell command that sweeps over the entire range, fromhere? https://lore.kernel.org/netdev/20231030222035.oqos7v7sdq5u6mti@skbuf/
On Wed, Nov 01, 2023 at 09:26:50PM +0100, Linus Walleij wrote: > On Wed, Nov 1, 2023 at 1:35 PM Luiz Angelo Daros de Luca <luizluca@gmail.com> wrote: > > > Sorry but I noticed no issues: > > Don't be sorry about that, it's good news because now I know > where to look, i.e. in the ethernet controller. Don't look too deeply into the code just yet, just try to see what happens with dsa_loop on an identical Ethernet controller that isn't physically attached to a switch.
On Tue, Oct 31, 2023 at 10:22:05PM +0100, Linus Walleij wrote: > Hi Vladimir, > > I got around to testing this too: > > # ping -s 1472 192.168.1.137 > > The result: > > SKB before padding: > 37 (192.168.1.13skb len=1514 headroom=18 headlen=1514 tailroom=260 > mac=(18,14) net=(32,20) trans=52 > shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > csum(0xd4ef2b1 ip_summed=0 complete_sw=0 valid=0 level=0) > hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0 > 7): 1472 data bydev name=lan0 feat=0x0002000000005020 > tes > sk family=2 type=3 proto=1 > > SKB after padding: > skb len=1518 headroom=18 headlen=1518 tailroom=256 > mac=(18,14) net=(32,20) trans=52 > shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > csum(0xd4ef2b1 ip_summed=0 complete_sw=0 valid=0 level=0) > hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0 > dev name=lan0 feat=0x0002000000005020 > sk family=2 type=3 proto=1 > > As expected the linear SKB is 4 bytes longer in this case. Ok, I'm not seeing anything unusual about the skb geometry in the skb_dump() output.
On Tue, Oct 31, 2023 at 08:02:29PM +0100, Linus Walleij wrote: > On Tue, Oct 31, 2023 at 5:34 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > So on the gemini-dlink-dir-685.dts platform, you can also use &gmac1 as > > a plain Ethernet port, right? > > As a port it exist on the SoC yes but it is not connected physically > to anything. > > &gmac0 is connected to the switch, and the switch has all the PHYs. (...) > If you by remote end mean the end of a physical cable there is > no way I can do that, as I have no PHY on gmac1. > > (I don't know if I misunderstand the question...) No, you aren't. > But I have other Gemini platforms, so I will try to do it on one > of them! Let's see if I can do this thing.... Ok.
> > [ 3.967086] realtek-smi switch: set MAC: 42:E4:F5:XX:XX:XX > > Unrelated: I have seen other DSA switches "inherit" the MAC of the > conduit interface (BRCM). I wonder if we could do that too instead > of this random MAC number. Usually the conduit interface has a > properly configured MAC. > > > [ 3.976779] realtek-smi switch: missing child interrupt-controller node > > [ 3.983455] realtek-smi switch: no interrupt support > > [ 4.158891] realtek-smi switch: no LED for port 5 > > Are the LEDs working? My device has no LEDs so I have never > tested it, despite I coded it. (Also these days we can actually > support each LED individually configured from device tree using > the LED API, but that would be quite a bit of code, so only some > fun for the aspiring developer...) No at all. LEDs sometimes change state all together but they normally are just kept on. My device is not funny to play with. It has only 32MB of RAM and it frequently OOM. Even flashing a new image requires some sokoban, unloading all possible kernel modules. It is not usable anymore in the real world but I might take a look at LEDs just for fun. The LEDs in the old swconfig driver do work and I can compare the code. > > Maybe the ag71xx driver is doing something differently. > > I'll look at it! > > Thanks a lot Luiz, I'm glad to help. Regards, Luiz
> > Hi Luiz, > > On Wed, Nov 01, 2023 at 09:35:30AM -0300, Luiz Angelo Daros de Luca wrote: > > Hi Linus, > > > > Sorry but I noticed no issues: > > > > From the router: > > > > No. Time Source Destination Protocol Length Info > > 1 0.000000000 192.168.1.1 192.168.1.2 ICMP 1514 Echo (ping) request id=0x0789, seq=23/5888, ttl=64 (reply in 2) > > 2 0.000040094 192.168.1.2 192.168.1.1 ICMP 1514 Echo (ping) reply id=0x0789, seq=23/5888, ttl=64 (request in 1) > > > > From the host: > > > > No. Time Source Destination Protocol Length Info > > 1 0.000000000 192.168.1.2 192.168.1.1 ICMP 1514 Echo (ping) request id=0x0002, seq=8/2048, ttl=64 (reply in 2) > > 2 0.000391800 192.168.1.1 192.168.1.2 ICMP 1514 Echo (ping) reply id=0x0002, seq=8/2048, ttl=64 (request in 1) > > > > If I go over that limit, it fragments the packet as expected. > > Could you run the shell command that sweeps over the entire range, fromhere? > https://lore.kernel.org/netdev/20231030222035.oqos7v7sdq5u6mti@skbuf/ Sure. I might be able to run it on Monday. I'm away from the device and it might have OOM. It has too little RAM to survive too much time by itself. However, I doubt it might fail at a specific packet range as it would hang an interactive SSH session quite easily. Regards, Luiz
> > [ 3.976779] realtek-smi switch: missing child interrupt-controller node > > [ 3.983455] realtek-smi switch: no interrupt support > > [ 4.158891] realtek-smi switch: no LED for port 5 > > Are the LEDs working? My device has no LEDs so I have never > tested it, despite I coded it. (Also these days we can actually > support each LED individually configured from device tree using > the LED API, but that would be quite a bit of code, so only some > fun for the aspiring developer...) Hi Linus, I took a look at the LED code. It looks like you got it a little bit wrong. /* Set blinking, TODO: make this configurable */ ret = regmap_update_bits(priv->map, RTL8366RB_LED_BLINKRATE_REG, RTL8366RB_LED_BLINKRATE_MASK, RTL8366RB_LED_BLINKRATE_56MS); if (ret) return ret; /* Set up LED activity: * Each port has 4 LEDs, we configure all ports to the same * behaviour (no individual config) but we can set up each * LED separately. */ if (priv->leds_disabled) { /* Turn everything off */ regmap_update_bits(priv->map, RTL8366RB_LED_0_1_CTRL_REG, 0x0FFF, 0); regmap_update_bits(priv->map, RTL8366RB_LED_2_3_CTRL_REG, 0x0FFF, 0); regmap_update_bits(priv->map, RTL8366RB_INTERRUPT_CONTROL_REG, RTL8366RB_P4_RGMII_LED, 0); val = RTL8366RB_LED_OFF; } else { /* TODO: make this configurable per LED */ val = RTL8366RB_LED_FORCE; } for (i = 0; i < 4; i++) { ret = regmap_update_bits(priv->map, RTL8366RB_LED_CTRL_REG, 0xf << (i * 4), val << (i * 4)); if (ret) return ret; } If LEDs are not disabled, it will use the RTL8366RB_LED_FORCE for all 4 LED groups. That RTL8366RB_LED_FORCE keeps the LEDs on. I would use RTL8366RB_LED_LINK_ACT by default to make it blink on link activity (or make it configurable as the comment suggests) but it is not wrong. I cannot evaluate the RTL8366RB_INTERRUPT_CONTROL_REG usage when you disable the LEDs but it seems to be odd. We also have: static void rb8366rb_set_port_led(struct realtek_priv *priv, int port, bool enable) { u16 val = enable ? 0x3f : 0; int ret; if (priv->leds_disabled) return; switch (port) { case 0: ret = regmap_update_bits(priv->map, RTL8366RB_LED_0_1_CTRL_REG, 0x3F, val); break; case 1: ret = regmap_update_bits(priv->map, RTL8366RB_LED_0_1_CTRL_REG, 0x3F << RTL8366RB_LED_1_OFFSET, val << RTL8366RB_LED_1_OFFSET); break; case 2: ret = regmap_update_bits(priv->map, RTL8366RB_LED_2_3_CTRL_REG, 0x3F, val); break; case 3: ret = regmap_update_bits(priv->map, RTL8366RB_LED_2_3_CTRL_REG, 0x3F << RTL8366RB_LED_3_OFFSET, val << RTL8366RB_LED_3_OFFSET); break; case 4: ret = regmap_update_bits(priv->map, RTL8366RB_INTERRUPT_CONTROL_REG, RTL8366RB_P4_RGMII_LED, enable ? RTL8366RB_P4_RGMII_LED : 0); break; default: dev_err(priv->dev, "no LED for port %d\n", port); return; } if (ret) dev_err(priv->dev, "error updating LED on port %d\n", port); } Here things gets strange. The code assumes that RTL8366RB_LED_0_1_CTRL_REG is related to ports 0 and 1. However, it is actually LED group 0 and 1. I don't have the docs but the register seem to enable/disable a port in a group. The first LED pins for all ports form the group 0 (and so on). My device only use the group 0 for its 5 ports, limiting my tests. Anyway, to make all ports blink on link act, I need, at least: RTL8366RB_LED_CTRL_REG(0x0431) = 0x0002 / 0x000f (set led group 0 to RTL8366RB_LED_LINK_ACT or 0x2) RTL8366RB_LED_0_1_CTRL_REG(0x0432) = 0x001f / 0x003f (enable ports 0..5 in LED group 0. I don't really know the mask but the probe code indicates it is 6 bits per group). If you really want to disable port 0 LEDs in all groups, you should unset the first bit for each group. If the mask is really 0x3f, it would be: ret = regmap_update_bits(priv->map, RTL8366RB_LED_0_1_CTRL_REG, port, enable); ret = regmap_update_bits(priv->map, RTL8366RB_LED_0_1_CTRL_REG, port << RTL8366RB_LED_1_OFFSET, enable); ret = regmap_update_bits(priv->map, RTL8366RB_LED_2_3_CTRL_REG, port, enable); ret = regmap_update_bits(priv->map, RTL8366RB_LED_2_3_CTRL_REG, port << RTL8366RB_LED_3_OFFSET, enable); This message, in fact, does not make sense: > > [ 4.158891] realtek-smi switch: no LED for port 5 I though that maybe we could setup a LED driver to expose the LEDs status in sysfs. However, I'm not sure it is worth it. If you change a LED behavior, it would break the HW triggering rule for all the group. I'm not sure the LED API is ready to expose LEDs with related fate. It would, indeed, be useful as a readonly source or just to enable/disable a LED. Regards, Luiz
On Sat, Dec 30, 2023 at 6:19 AM Luiz Angelo Daros de Luca <luizluca@gmail.com> wrote: > I took a look at the LED code. It looks like you got it a little bit wrong. You are right... > If LEDs are not disabled, it will use the RTL8366RB_LED_FORCE for all > 4 LED groups. That RTL8366RB_LED_FORCE keeps the LEDs on. I would use > RTL8366RB_LED_LINK_ACT by default to make it blink on link activity > (or make it configurable as the comment suggests) but it is not wrong. > I cannot evaluate the RTL8366RB_INTERRUPT_CONTROL_REG usage when you > disable the LEDs but it seems to be odd. The problem is that since I don't have a device with LEDs connected to tHE RTL8366RB it is all just dry coding. I would suggest if you can test it just make a basic patch that will at least turn on the LEDs to some default setting that works for you? > I though that maybe we could setup a LED driver to expose the LEDs > status in sysfs. However, I'm not sure it is worth it. If you change a > LED behavior, it would break the HW triggering rule for all the group. > I'm not sure the LED API is ready to expose LEDs with related fate. It > would, indeed, be useful as a readonly source or just to > enable/disable a LED. The LED subsystem supports hardware triggering etc thanks to the elaborate work by Christian (ansuel). You can see an example of how this is done in: drivers/net/dsa/qca/qca8k-leds.c Christian also extended the LEDs subsystem with the necessary callbacks to support HW-backed LED control. This can be used already to achieve HW triggers for the LEDs from sysfs. (See callbacks .hw_control_is_supported, .hw_control_set etc etc). I was working to implement this for the Marvell switches but Andrew wanted to do some more structured approach with a LED library for DSA switches. Yours, Linus Walleij
> > I took a look at the LED code. It looks like you got it a little bit wrong. > > You are right... > > > If LEDs are not disabled, it will use the RTL8366RB_LED_FORCE for all > > 4 LED groups. That RTL8366RB_LED_FORCE keeps the LEDs on. I would use > > RTL8366RB_LED_LINK_ACT by default to make it blink on link activity > > (or make it configurable as the comment suggests) but it is not wrong. > > I cannot evaluate the RTL8366RB_INTERRUPT_CONTROL_REG usage when you > > disable the LEDs but it seems to be odd. > > The problem is that since I don't have a device with LEDs connected > to tHE RTL8366RB it is all just dry coding. > > I would suggest if you can test it just make a basic patch that will > at least turn on the LEDs to some default setting that works for > you? Sure. I believe using link act will be much more useful than just turning all leds on, independently from the port state. It is an easy one-line fix. I can do that after the other series gets merged. I think that we should also remove rb8366rb_set_port_led(). Even if I fix it, it will just turn the LED off when the port is administratively down. RTL8366RB_LED_0_1_CTRL_REG and RTL8366RB_LED_2_3_CTRL_REG are only used to control leds when you force them to be on (RTL8366RB_LED_FORCE). In that scenario, the OS might be in charge of triggering them, even for uses not related to the switch. > > I though that maybe we could setup a LED driver to expose the LEDs > > status in sysfs. However, I'm not sure it is worth it. If you change a > > LED behavior, it would break the HW triggering rule for all the group. > > I'm not sure the LED API is ready to expose LEDs with related fate. It > > would, indeed, be useful as a readonly source or just to > > enable/disable a LED. > > The LED subsystem supports hardware triggering etc thanks to the > elaborate work by Christian (ansuel). You can see an example of how > this is done in: > drivers/net/dsa/qca/qca8k-leds.c > > Christian also extended the LEDs subsystem with the necessary > callbacks to support HW-backed LED control. > > This can be used already to achieve HW triggers for the LEDs > from sysfs. (See callbacks .hw_control_is_supported, > .hw_control_set etc etc). > > I was working to implement this for the Marvell switches but Andrew > wanted to do some more structured approach with a LED library > for DSA switches. Yes, I took a look at it. It would be my inspiration if I go down that road. I can use the HW offload only if all LEDs in the group share the same trigger and timer. Otherwise, I'll need to fall back to software control. I'll think about it if that is a viable solution. Regards, Luiz
diff --git a/net/dsa/tag_rtl4_a.c b/net/dsa/tag_rtl4_a.c index c327314b95e3..3292bc49b158 100644 --- a/net/dsa/tag_rtl4_a.c +++ b/net/dsa/tag_rtl4_a.c @@ -45,6 +45,16 @@ static struct sk_buff *rtl4a_tag_xmit(struct sk_buff *skb, if (unlikely(__skb_put_padto(skb, ETH_ZLEN, false))) return NULL; + /* Packets over 1496 bytes get dropped unless they get padded + * out to 1518 bytes. 1496 is ETH_DATA_LEN - tag which is hardly + * a coinicidence, and 1518 is ETH_FRAME_LEN + FCS so we define + * the threshold size and padding like this. + */ + if (skb->len >= (ETH_DATA_LEN - RTL4_A_HDR_LEN)) { + if (unlikely(__skb_put_padto(skb, ETH_FRAME_LEN + ETH_FCS_LEN, false))) + return NULL; + } + netdev_dbg(dev, "add realtek tag to package to port %d\n", dp->index); skb_push(skb, RTL4_A_HDR_LEN);
It was reported that the "LuCI" web UI was not working properly with a device using the RTL8366RB switch. Disabling the egress port tagging code made the switch work again, but this is not a good solution as we want to be able to direct traffic to a certain port. It turns out that packets between 1496 and 1500 bytes are dropped unless padded out to 1518 bytes. If we pad the ethernet frames to a minimum of ETH_FRAME_LEN + FCS (1518 bytes) everything starts working fine. 1496 is the size of a normal data frame minus the internal DSA tag. The number is intuitive, the explanation evades me. As we completely lack datasheet or any insight into how this switch works, this trial-and-error solution is the best we can do. FWIW this bug may very well be the reason why Realteks code drops are not using CPU tagging. The vendor routers using this switch are all hardwired to disable CPU tagging and all packets are pushed to all ports on TX. This is also the case in the old OpenWrt driver, derived from the vendor code drops. I have tested smaller sizes, only 1518 or bigger padding works. Modifying the MTU on the switch (one setting affecting all ports) has no effect. Before this patch: PING 192.168.1.1 (192.168.1.1) 1470(1498) bytes of data. ^C --- 192.168.1.1 ping statistics --- 2 packets transmitted, 0 received, 100% packet loss, time 1048ms PING 192.168.1.1 (192.168.1.1) 1472(1500) bytes of data. ^C --- 192.168.1.1 ping statistics --- 12 packets transmitted, 0 received, 100% packet loss, time 11267ms After this patch: PING 192.168.1.1 (192.168.1.1) 1470(1498) bytes of data. 1478 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.533 ms 1478 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.630 ms PING 192.168.1.1 (192.168.1.1) 1472(1500) bytes of data. 1480 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.527 ms 1480 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.562 ms Also LuCI starts working with the 1500 bytes frames it produces from the HTTP server. Fixes: 9eb8bc593a5e ("net: dsa: tag_rtl4_a: fix egress tags") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> --- Changes in v2: - Pad packages >= 1496 bytes after some further tests - Use more to-the-point description - Provide ping results in the commit message - Link to v1: https://lore.kernel.org/r/20231027-fix-rtl8366rb-v1-1-d565d905535a@linaro.org --- net/dsa/tag_rtl4_a.c | 10 ++++++++++ 1 file changed, 10 insertions(+) --- base-commit: d9e164e4199bc465b3540d5fe3ffc8a9a4fc6157 change-id: 20231027-fix-rtl8366rb-e752bd5901ca Best regards,