Message ID | 20250414152634.2786447-1-fiona.klute@gmx.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [net] net: phy: microchip: force IRQ polling mode for lan88xx | expand |
On Mon, Apr 14, 2025 at 05:26:33PM +0200, Fiona Klute wrote: > With lan88xx based devices the lan78xx driver can get stuck in an > interrupt loop while bringing the device up, flooding the kernel log > with messages like the following: > > lan78xx 2-3:1.0 enp1s0u3: kevent 4 may have been dropped > > Removing interrupt support from the lan88xx PHY driver forces the > driver to use polling instead, which avoids the problem. > > The issue has been observed with Raspberry Pi devices at least since > 4.14 (see [1], bug report for their downstream kernel), as well as > with Nvidia devices [2] in 2020, where disabling polling was the > vendor-suggested workaround (together with the claim that phylib > changes in 4.9 made the interrupt handling in lan78xx incompatible). > > Iperf reports well over 900Mbits/sec per direction with client in > --dualtest mode, so there does not seem to be a significant impact on > throughput (lan88xx device connected via switch to the peer). > > [1] https://github.com/raspberrypi/linux/issues/2447 > [2] https://forums.developer.nvidia.com/t/jetson-xavier-and-lan7800-problem/142134/11 > > Link: https://lore.kernel.org/0901d90d-3f20-4a10-b680-9c978e04ddda@lunn.ch > Signed-off-by: Fiona Klute <fiona.klute@gmx.de> > Cc: kernel-list@raspberrypi.com > Cc: stable@vger.kernel.org Thanks for submitting this. Two nit picks: It needed a Fixes: tag. Probably: Fixes: 792aec47d59d ("add microchip LAN88xx phy driver") > static int lan88xx_suspend(struct phy_device *phydev) > { > struct lan88xx_priv *priv = phydev->priv; > @@ -528,9 +487,6 @@ static struct phy_driver microchip_phy_driver[] = { > .config_aneg = lan88xx_config_aneg, > .link_change_notify = lan88xx_link_change_notify, > > - .config_intr = lan88xx_phy_config_intr, > - .handle_interrupt = lan88xx_handle_interrupt, > - Maybe add a comment somewhere around here that interrupts are broken, so not supported. Developers frequently don't look at commit messages, but are more likely to notice a comment. Thanks Andrew --- pw-bot: cr
Am 14.04.25 um 17:43 schrieb Andrew Lunn: > On Mon, Apr 14, 2025 at 05:26:33PM +0200, Fiona Klute wrote: >> With lan88xx based devices the lan78xx driver can get stuck in an >> interrupt loop while bringing the device up, flooding the kernel log >> with messages like the following: >> >> lan78xx 2-3:1.0 enp1s0u3: kevent 4 may have been dropped >> >> Removing interrupt support from the lan88xx PHY driver forces the >> driver to use polling instead, which avoids the problem. >> >> The issue has been observed with Raspberry Pi devices at least since >> 4.14 (see [1], bug report for their downstream kernel), as well as >> with Nvidia devices [2] in 2020, where disabling polling was the >> vendor-suggested workaround (together with the claim that phylib >> changes in 4.9 made the interrupt handling in lan78xx incompatible). >> >> Iperf reports well over 900Mbits/sec per direction with client in >> --dualtest mode, so there does not seem to be a significant impact on >> throughput (lan88xx device connected via switch to the peer). >> >> [1] https://github.com/raspberrypi/linux/issues/2447 >> [2] https://forums.developer.nvidia.com/t/jetson-xavier-and-lan7800-problem/142134/11 >> >> Link: https://lore.kernel.org/0901d90d-3f20-4a10-b680-9c978e04ddda@lunn.ch >> Signed-off-by: Fiona Klute <fiona.klute@gmx.de> >> Cc: kernel-list@raspberrypi.com >> Cc: stable@vger.kernel.org > > Thanks for submitting this. Two nit picks: > > It needed a Fixes: tag. Probably: > > Fixes: 792aec47d59d ("add microchip LAN88xx phy driver") Sure, will add that (and a comment) and resend. I wasn't sure if I should add it if I can't pinpoint exactly where the problem was introduced, and it looks like the interrupt handling was changed a bit after. Best regards, Fiona
> > Thanks for submitting this. Two nit picks: > > > > It needed a Fixes: tag. Probably: > > > > Fixes: 792aec47d59d ("add microchip LAN88xx phy driver") > Sure, will add that (and a comment) and resend. I wasn't sure if I > should add it if I can't pinpoint exactly where the problem was > introduced, and it looks like the interrupt handling was changed a bit > after. If you can prove interrupt handling did actually work to start with, maybe a later commit could be referenced. But i'm not sure it is worth the effort. Andrew
diff --git a/drivers/net/phy/microchip.c b/drivers/net/phy/microchip.c index 0e17cc458efd..06e286387fa9 100644 --- a/drivers/net/phy/microchip.c +++ b/drivers/net/phy/microchip.c @@ -37,47 +37,6 @@ static int lan88xx_write_page(struct phy_device *phydev, int page) return __phy_write(phydev, LAN88XX_EXT_PAGE_ACCESS, page); } -static int lan88xx_phy_config_intr(struct phy_device *phydev) -{ - int rc; - - if (phydev->interrupts == PHY_INTERRUPT_ENABLED) { - /* unmask all source and clear them before enable */ - rc = phy_write(phydev, LAN88XX_INT_MASK, 0x7FFF); - rc = phy_read(phydev, LAN88XX_INT_STS); - rc = phy_write(phydev, LAN88XX_INT_MASK, - LAN88XX_INT_MASK_MDINTPIN_EN_ | - LAN88XX_INT_MASK_LINK_CHANGE_); - } else { - rc = phy_write(phydev, LAN88XX_INT_MASK, 0); - if (rc) - return rc; - - /* Ack interrupts after they have been disabled */ - rc = phy_read(phydev, LAN88XX_INT_STS); - } - - return rc < 0 ? rc : 0; -} - -static irqreturn_t lan88xx_handle_interrupt(struct phy_device *phydev) -{ - int irq_status; - - irq_status = phy_read(phydev, LAN88XX_INT_STS); - if (irq_status < 0) { - phy_error(phydev); - return IRQ_NONE; - } - - if (!(irq_status & LAN88XX_INT_STS_LINK_CHANGE_)) - return IRQ_NONE; - - phy_trigger_machine(phydev); - - return IRQ_HANDLED; -} - static int lan88xx_suspend(struct phy_device *phydev) { struct lan88xx_priv *priv = phydev->priv; @@ -528,9 +487,6 @@ static struct phy_driver microchip_phy_driver[] = { .config_aneg = lan88xx_config_aneg, .link_change_notify = lan88xx_link_change_notify, - .config_intr = lan88xx_phy_config_intr, - .handle_interrupt = lan88xx_handle_interrupt, - .suspend = lan88xx_suspend, .resume = genphy_resume, .set_wol = lan88xx_set_wol,
With lan88xx based devices the lan78xx driver can get stuck in an interrupt loop while bringing the device up, flooding the kernel log with messages like the following: lan78xx 2-3:1.0 enp1s0u3: kevent 4 may have been dropped Removing interrupt support from the lan88xx PHY driver forces the driver to use polling instead, which avoids the problem. The issue has been observed with Raspberry Pi devices at least since 4.14 (see [1], bug report for their downstream kernel), as well as with Nvidia devices [2] in 2020, where disabling polling was the vendor-suggested workaround (together with the claim that phylib changes in 4.9 made the interrupt handling in lan78xx incompatible). Iperf reports well over 900Mbits/sec per direction with client in --dualtest mode, so there does not seem to be a significant impact on throughput (lan88xx device connected via switch to the peer). [1] https://github.com/raspberrypi/linux/issues/2447 [2] https://forums.developer.nvidia.com/t/jetson-xavier-and-lan7800-problem/142134/11 Link: https://lore.kernel.org/0901d90d-3f20-4a10-b680-9c978e04ddda@lunn.ch Signed-off-by: Fiona Klute <fiona.klute@gmx.de> Cc: kernel-list@raspberrypi.com Cc: stable@vger.kernel.org --- drivers/net/phy/microchip.c | 44 ------------------------------------- 1 file changed, 44 deletions(-)