Message ID | d142b909d0600b67b9ceadc767c4177be216f5bd.1720512888.git.0x1207@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: stmmac: refactor FPE for gmac4 and xgmac | expand |
On Tue, Jul 09, 2024 at 04:21:19PM +0800, Furong Xu wrote: > The FPE support for xgmac is incomplete, drop it temporarily. > Once FPE implementation is refactored, xgmac support will be added. This is a pretty unusual thing to do. What does the current implementation do? Is there enough for it to actually work? If i was doing a git bisect and landed on this patch, could i find my networking is broken? More normal is to build a new implementation by the side, and then swap to it. Andrew
Hi Andrew, Furong, On Tue, Jul 09, 2024 at 03:16:35PM +0200, Andrew Lunn wrote: > On Tue, Jul 09, 2024 at 04:21:19PM +0800, Furong Xu wrote: > > The FPE support for xgmac is incomplete, drop it temporarily. > > Once FPE implementation is refactored, xgmac support will be added. > > This is a pretty unusual thing to do. What does the current > implementation do? Is there enough for it to actually work? If i was > doing a git bisect and landed on this patch, could i find my > networking is broken? > > More normal is to build a new implementation by the side, and then > swap to it. > > Andrew > There were 2 earlier attempts from Jianheng Zhang @ Synopsys to add FPE support to new hardware. I told him that the #1 priority should be to move the stmmac driver over to the new standard API which uses ethtool + tc. https://lore.kernel.org/netdev/CY5PR12MB63726FED738099761A9B81E7BF8FA@CY5PR12MB6372.namprd12.prod.outlook.com/ https://lore.kernel.org/netdev/CY5PR12MB63727C24923AE855CFF0D425BF8EA@CY5PR12MB6372.namprd12.prod.outlook.com/ I'm not sure what happened in the meantime. Jianheng must have faced some issue, because he never came back. I did comment this at the time: | Even this very patch is slightly strange - it is not brand new hardware | support, but it fills in some more FPE ops in dwxlgmac2_ops - when | dwxgmac3_fpe_configure() was already there. So this suggests the | existing support was incomplete. How complete is it now? No way to tell. | There is a selftest to tell, but we can't run it because the driver | doesn't integrate with those kernel APIs. So it is relatively known that the support is incomplete. But I still think we should push for more reviewer insight into this driver by having access to a selftest to get a clearer picture of how it behaves. For that, we need the compliance to the common API. Otherwise, I completely agree to the idea that any development should be done incrementally on top of whatever already exists, instead of putting a curtain on and then taking it back off.
Hi Vladimir On Tue, 9 Jul 2024 20:10:18 +0300, Vladimir Oltean <olteanv@gmail.com> wrote: > Hi Andrew, Furong, > > On Tue, Jul 09, 2024 at 03:16:35PM +0200, Andrew Lunn wrote: > > On Tue, Jul 09, 2024 at 04:21:19PM +0800, Furong Xu wrote: > > > The FPE support for xgmac is incomplete, drop it temporarily. > > > Once FPE implementation is refactored, xgmac support will be added. > > > > This is a pretty unusual thing to do. What does the current > > implementation do? Is there enough for it to actually work? If i was > > doing a git bisect and landed on this patch, could i find my > > networking is broken? > > > > More normal is to build a new implementation by the side, and then > > swap to it. > > > > Andrew > > > > There were 2 earlier attempts from Jianheng Zhang @ Synopsys to add FPE > support to new hardware. > > I told him that the #1 priority should be to move the stmmac driver over > to the new standard API which uses ethtool + tc. > https://lore.kernel.org/netdev/CY5PR12MB63726FED738099761A9B81E7BF8FA@CY5PR12MB6372.namprd12.prod.outlook.com/ > https://lore.kernel.org/netdev/CY5PR12MB63727C24923AE855CFF0D425BF8EA@CY5PR12MB6372.namprd12.prod.outlook.com/ > > I'm not sure what happened in the meantime. Jianheng must have faced > some issue, because he never came back. > > I did comment this at the time: > > | Even this very patch is slightly strange - it is not brand new hardware > | support, but it fills in some more FPE ops in dwxlgmac2_ops - when > | dwxgmac3_fpe_configure() was already there. So this suggests the > | existing support was incomplete. How complete is it now? No way to tell. > | There is a selftest to tell, but we can't run it because the driver > | doesn't integrate with those kernel APIs. > > So it is relatively known that the support is incomplete. But I still > think we should push for more reviewer insight into this driver by > having access to a selftest to get a clearer picture of how it behaves. > For that, we need the compliance to the common API. > After some searching and learning about your commits for FPE using the generic framework, I think it is clear enough to me to implement the new standard driver interface which uses ethtool + tc, and then the refactor of low level FPE function can follow. Thanks.
Hi Vladimir, I hope this email finds you well. I'm reaching out because I'm having an issue with the SJA1105QELY switch connected to an STM32MP151CAA3T over an RGMII interface. About 1 in every 10 starts, the SJA1105 fails to receive frames from the CPU. Specifically, the p04_n_rxfrm counters stay at 0, and all RX error counters are zero too, indicating that the port doesn't seem to see any frames at all. I can reproduce this issue even without rebooting, just by using the unbind/bind sequence: ```sh echo spi0.0 > /sys/bus/spi/drivers/sja1105/unbind echo spi0.0 > /sys/bus/spi/drivers/sja1105/bind ip l s dev t10 up sleep 1 ethtool -t t10 ``` Running the ethtool self-test is the most reliable way to reproduce the problem early without additional software. I've checked the RX_CTL and RX_CLK lines of the SJA1105 port 4 with an oscilloscope, and both look correct and identical in both working and non-working cases. Interestingly, the external RGMII switch ports are working fine. I can bridge them and push frames in all directions without issues. Transferring frames from the switch to the CPU works fine as well, which makes me suspect that the problem is isolated to the reception of frames on the CPU RGMII interface. Is it possible there's some RGMII-specific race condition during the initialization stage? The bootloader implementation of the SJA1105 on same HW seems to work fine too, so this seems to be a Linux kernel-specific issue. Any insights or suggestions you might have would be greatly appreciated! Thanks a lot for your help. Best regards, Oleksij
Hi Oleksij, On Wed, Jul 10, 2024 at 03:31:12PM +0200, Oleksij Rempel wrote: > Hi Vladimir, > > I hope this email finds you well. > > I'm reaching out because I'm having an issue with the SJA1105QELY switch > connected to an STM32MP151CAA3T over an RGMII interface. About 1 in > every 10 starts, the SJA1105 fails to receive frames from the CPU. > Specifically, the p04_n_rxfrm counters stay at 0, and all RX error > counters are zero too, indicating that the port doesn't seem to see any > frames at all. > > I can reproduce this issue even without rebooting, just by using the > unbind/bind sequence: > > ```sh > echo spi0.0 > /sys/bus/spi/drivers/sja1105/unbind > echo spi0.0 > /sys/bus/spi/drivers/sja1105/bind > ip l s dev t10 up > sleep 1 > ethtool -t t10 > ``` > > Running the ethtool self-test is the most reliable way to reproduce the > problem early without additional software. > > I've checked the RX_CTL and RX_CLK lines of the SJA1105 port 4 with an > oscilloscope, and both look correct and identical in both working and > non-working cases. > > Interestingly, the external RGMII switch ports are working fine. I can > bridge them and push frames in all directions without issues. > Transferring frames from the switch to the CPU works fine as well, which > makes me suspect that the problem is isolated to the reception of frames > on the CPU RGMII interface. > > Is it possible there's some RGMII-specific race condition during the > initialization stage? The bootloader implementation of the SJA1105 on same HW > seems to work fine too, so this seems to be a Linux kernel-specific issue. > > Any insights or suggestions you might have would be greatly appreciated! > > Thanks a lot for your help. The symptoms sound awfully similar to a situation described in NXP document AH1704, chapter 6.1.13.3. I can't disclose anything from its contents here, since it is not a public document, requiring, AFAIK, sign-in. But there have been workarounds implemented in other drivers for this particular issue, used as DSA masters for this switch. For example commit 4fa6c976158b ("net: stmmac: dwmac-imx: pause the TXC clock in fixed-link"). If you get access to the NXP document, maybe you can confirm that the circumstances leading to the issue are the same, and you are able to implement a similar workaround for this NIC (also stmmac driver, I believe?!). Hope this helps.
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h index 6a2c7d22df1e..917796293c26 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h @@ -193,8 +193,6 @@ #define XGMAC_MDIO_ADDR 0x00000200 #define XGMAC_MDIO_DATA 0x00000204 #define XGMAC_MDIO_C22P 0x00000220 -#define XGMAC_FPE_CTRL_STS 0x00000280 -#define XGMAC_EFPE BIT(0) #define XGMAC_ADDRx_HIGH(x) (0x00000300 + (x) * 0x8) #define XGMAC_ADDR_MAX 32 #define XGMAC_AE BIT(31) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c index 6a987cf598e4..e5bc3957041e 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c @@ -1505,31 +1505,6 @@ static void dwxgmac2_set_arp_offload(struct mac_device_info *hw, bool en, writel(value, ioaddr + XGMAC_RX_CONFIG); } -static void dwxgmac3_fpe_configure(void __iomem *ioaddr, struct stmmac_fpe_cfg *cfg, - u32 num_txq, - u32 num_rxq, bool enable) -{ - u32 value; - - if (!enable) { - value = readl(ioaddr + XGMAC_FPE_CTRL_STS); - - value &= ~XGMAC_EFPE; - - writel(value, ioaddr + XGMAC_FPE_CTRL_STS); - return; - } - - value = readl(ioaddr + XGMAC_RXQ_CTRL1); - value &= ~XGMAC_RQ; - value |= (num_rxq - 1) << XGMAC_RQ_SHIFT; - writel(value, ioaddr + XGMAC_RXQ_CTRL1); - - value = readl(ioaddr + XGMAC_FPE_CTRL_STS); - value |= XGMAC_EFPE; - writel(value, ioaddr + XGMAC_FPE_CTRL_STS); -} - const struct stmmac_ops dwxgmac210_ops = { .core_init = dwxgmac2_core_init, .set_mac = dwxgmac2_set_mac, @@ -1570,7 +1545,6 @@ const struct stmmac_ops dwxgmac210_ops = { .config_l3_filter = dwxgmac2_config_l3_filter, .config_l4_filter = dwxgmac2_config_l4_filter, .set_arp_offload = dwxgmac2_set_arp_offload, - .fpe_configure = dwxgmac3_fpe_configure, }; static void dwxlgmac2_rx_queue_enable(struct mac_device_info *hw, u8 mode, @@ -1627,7 +1601,6 @@ const struct stmmac_ops dwxlgmac2_ops = { .config_l3_filter = dwxgmac2_config_l3_filter, .config_l4_filter = dwxgmac2_config_l4_filter, .set_arp_offload = dwxgmac2_set_arp_offload, - .fpe_configure = dwxgmac3_fpe_configure, }; int dwxgmac2_setup(struct stmmac_priv *priv)
The FPE support for xgmac is incomplete, drop it temporarily. Once FPE implementation is refactored, xgmac support will be added. Signed-off-by: Furong Xu <0x1207@gmail.com> --- .../net/ethernet/stmicro/stmmac/dwxgmac2.h | 2 -- .../ethernet/stmicro/stmmac/dwxgmac2_core.c | 27 ------------------- 2 files changed, 29 deletions(-)