Message ID | 20220331114819.14929-1-huangguangbin2@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: phy: genphy_loopback: fix loopback failed when speed is unknown | expand |
On Thu, Mar 31, 2022 at 07:48:19PM +0800, Guangbin Huang wrote: > If phy link status is down because link partner goes down, the phy speed > will be updated to SPEED_UNKNOWN when autoneg on with general phy driver. > If test loopback in this case, the phy speed will be set to 10M. However, > the speed of mac may not be 10M, it causes loopback test failed. > > To fix this problem, if speed is SPEED_UNKNOWN, don't configure link speed. I don't think this explanation is correct. If speed is UNKNOWN, ctl is just going to have BMCR_LOOPBACK set. That is very similar to what you are doing. The code then waits for the link to establish. This is where i guess your problem is. Are you seeing ETIMEDOUT? Does the link not establish? Thanks Andrew
On 2022/3/31 20:23, Andrew Lunn wrote: > On Thu, Mar 31, 2022 at 07:48:19PM +0800, Guangbin Huang wrote: >> If phy link status is down because link partner goes down, the phy speed >> will be updated to SPEED_UNKNOWN when autoneg on with general phy driver. >> If test loopback in this case, the phy speed will be set to 10M. However, >> the speed of mac may not be 10M, it causes loopback test failed. >> >> To fix this problem, if speed is SPEED_UNKNOWN, don't configure link speed. > > I don't think this explanation is correct. > > If speed is UNKNOWN, ctl is just going to have BMCR_LOOPBACK set. That > is very similar to what you are doing. The code then waits for the > link to establish. This is where i guess your problem is. Are you > seeing ETIMEDOUT? Does the link not establish? > > Thanks > Andrew > . > Hi Andrew This problem is not timeout, I have print return value of phy_read_poll_timeout() and it is 0. In this case, as speed and duplex both are unknown, ctl is just set to 0x4000. However, the follow code sets mask to ~0 for function phy_modify(): int genphy_loopback(struct phy_device *phydev, bool enable) { if (enable) { ... phy_modify(phydev, MII_BMCR, ~0, ctl); ... } so all other bits of BMCR will be cleared and just set bit 14, I use phy trace to prove that: $ cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 923/923 #P:128 # # _-----=> irqs-off/BH-disabled # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> migrate-disable # |||| / delay # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | kworker/u257:2-694 [015] ..... 209.263912: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 kworker/u257:2-694 [015] ..... 209.263951: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 kworker/u257:2-694 [015] ..... 209.263990: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 kworker/u257:2-694 [015] ..... 209.264028: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x09 val:0x0200 kworker/u257:2-694 [015] ..... 209.264067: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x0a val:0x0000 ethtool-1148 [007] ..... 209.665693: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 ethtool-1148 [007] ..... 209.665706: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1840 ethtool-1148 [007] ..... 210.588139: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1840 ethtool-1148 [007] ..... 210.588152: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1040 ethtool-1148 [007] ..... 210.615900: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 ethtool-1148 [007] ..... 210.615912: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x4000 //here just set bit 14 ethtool-1148 [007] ..... 210.620952: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.625992: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.631034: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.636075: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.641116: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.646159: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.651215: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.656256: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.661296: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.666338: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 ethtool-1148 [007] ..... 210.671378: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x798d ethtool-1148 [007] ..... 210.679016: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x4000 ethtool-1148 [007] ..... 210.679053: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x798d ethtool-1148 [007] ..... 210.679091: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x09 val:0x0200 ethtool-1148 [007] ..... 210.679129: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x0a val:0x0000 ethtool-1148 [007] ..... 210.695902: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x4000 ethtool-1148 [007] ..... 210.695939: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x798d ethtool-1148 [007] ..... 210.695977: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x09 val:0x0200 ethtool-1148 [007] ..... 210.696014: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x0a val:0x0000 So phy speed will be set to 10M in this case, if previous speed of device before going down is 10M, loopback test is pass. Only previous speed is 100M or 1000M, loopback test is failed. Thanks Guangbin .
> In this case, as speed and duplex both are unknown, ctl is just set to 0x4000. > However, the follow code sets mask to ~0 for function phy_modify(): > int genphy_loopback(struct phy_device *phydev, bool enable) > { > if (enable) { > ... > phy_modify(phydev, MII_BMCR, ~0, ctl); > ... > } > so all other bits of BMCR will be cleared and just set bit 14, I use phy trace to > prove that: > > $ cat /sys/kernel/debug/tracing/trace > # tracer: nop > # > # entries-in-buffer/entries-written: 923/923 #P:128 > # > # _-----=> irqs-off/BH-disabled > # / _----=> need-resched > # | / _---=> hardirq/softirq > # || / _--=> preempt-depth > # ||| / _-=> migrate-disable > # |||| / delay > # TASK-PID CPU# ||||| TIMESTAMP FUNCTION > # | | | ||||| | | > kworker/u257:2-694 [015] ..... 209.263912: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 > kworker/u257:2-694 [015] ..... 209.263951: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 > kworker/u257:2-694 [015] ..... 209.263990: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 > kworker/u257:2-694 [015] ..... 209.264028: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x09 val:0x0200 > kworker/u257:2-694 [015] ..... 209.264067: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x0a val:0x0000 > ethtool-1148 [007] ..... 209.665693: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 > ethtool-1148 [007] ..... 209.665706: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1840 > ethtool-1148 [007] ..... 210.588139: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1840 > ethtool-1148 [007] ..... 210.588152: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1040 > ethtool-1148 [007] ..... 210.615900: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 > ethtool-1148 [007] ..... 210.615912: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x4000 //here just set bit 14 > > So phy speed will be set to 10M in this case, if previous speed of > device before going down is 10M, loopback test is pass. Only > previous speed is 100M or 1000M, loopback test is failed. O.K. So it should be set into 10M half duplex. But why does this cause it not to loopback packets? Does the PHY you are using not actually support 10 Half? Why does it need to be the same speed as when the link was up? And why does it actually set LSTATUS indicating there is link? Is this a generic problem, all PHYs are like this, or is this specific to the PHY you are using? Maybe this PHY needs its own loopback function because it does something odd? Andrew
On Thu, Mar 31, 2022 at 04:26:47PM +0200, Andrew Lunn wrote: > > In this case, as speed and duplex both are unknown, ctl is just set to 0x4000. > > However, the follow code sets mask to ~0 for function phy_modify(): > > int genphy_loopback(struct phy_device *phydev, bool enable) > > { > > if (enable) { > > ... > > phy_modify(phydev, MII_BMCR, ~0, ctl); > > ... > > } > > so all other bits of BMCR will be cleared and just set bit 14, I use phy trace to > > prove that: > > > > $ cat /sys/kernel/debug/tracing/trace > > # tracer: nop > > # > > # entries-in-buffer/entries-written: 923/923 #P:128 > > # > > # _-----=> irqs-off/BH-disabled > > # / _----=> need-resched > > # | / _---=> hardirq/softirq > > # || / _--=> preempt-depth > > # ||| / _-=> migrate-disable > > # |||| / delay > > # TASK-PID CPU# ||||| TIMESTAMP FUNCTION > > # | | | ||||| | | > > kworker/u257:2-694 [015] ..... 209.263912: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 > > kworker/u257:2-694 [015] ..... 209.263951: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 > > kworker/u257:2-694 [015] ..... 209.263990: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x01 val:0x7989 > > kworker/u257:2-694 [015] ..... 209.264028: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x09 val:0x0200 > > kworker/u257:2-694 [015] ..... 209.264067: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x0a val:0x0000 > > ethtool-1148 [007] ..... 209.665693: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 > > ethtool-1148 [007] ..... 209.665706: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1840 > > ethtool-1148 [007] ..... 210.588139: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1840 > > ethtool-1148 [007] ..... 210.588152: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1040 > > ethtool-1148 [007] ..... 210.615900: mdio_access: mii-0000:bd:00.1 read phy:0x03 reg:0x00 val:0x1040 > > ethtool-1148 [007] ..... 210.615912: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x4000 //here just set bit 14 > > > > So phy speed will be set to 10M in this case, if previous speed of > > device before going down is 10M, loopback test is pass. Only > > previous speed is 100M or 1000M, loopback test is failed. > > O.K. So it should be set into 10M half duplex. But why does this cause > it not to loopback packets? Does the PHY you are using not actually > support 10 Half? Why does it need to be the same speed as when the > link was up? And why does it actually set LSTATUS indicating there is > link? > > Is this a generic problem, all PHYs are like this, or is this specific > to the PHY you are using? Maybe this PHY needs its own loopback > function because it does something odd? It looks for me like attempt to fix loopback test for setup without active link partner. Correct? Regards, Oleksij
> > O.K. So it should be set into 10M half duplex. But why does this cause > > it not to loopback packets? Does the PHY you are using not actually > > support 10 Half? Why does it need to be the same speed as when the > > link was up? And why does it actually set LSTATUS indicating there is > > link? > > > > Is this a generic problem, all PHYs are like this, or is this specific > > to the PHY you are using? Maybe this PHY needs its own loopback > > function because it does something odd? > > It looks for me like attempt to fix loopback test for setup without active > link partner. Correct? You should not need a link partner for loopback to work. This is local loopback. The PHY is also saying it has link, if the LSTATUS bit is set. So i don't see why previous speed is relevant hear. This seems to me to be an issue for this particular PHY. What i don't like about this patch is that it is not deterministic what mode the PHY will end up in if speed is unknown. Without the patch, it is 10Mbps, which is historically a sensible default. If this PHY has never had link, what speed does it use? Does it still work in that case? Andrew
On 2022/4/1 20:14, Andrew Lunn wrote: >>> O.K. So it should be set into 10M half duplex. But why does this cause >>> it not to loopback packets? Does the PHY you are using not actually >>> support 10 Half? Why does it need to be the same speed as when the >>> link was up? And why does it actually set LSTATUS indicating there is >>> link? >>> >>> Is this a generic problem, all PHYs are like this, or is this specific >>> to the PHY you are using? Maybe this PHY needs its own loopback >>> function because it does something odd? >> >> It looks for me like attempt to fix loopback test for setup without active >> link partner. Correct? > > You should not need a link partner for loopback to work. This is local > loopback. The PHY is also saying it has link, if the LSTATUS bit is > set. So i don't see why previous speed is relevant hear. This seems to > me to be an issue for this particular PHY. > > What i don't like about this patch is that it is not deterministic > what mode the PHY will end up in if speed is unknown. Without the > patch, it is 10Mbps, which is historically a sensible default. > > If this PHY has never had link, what speed does it use? Does it still > work in that case? > > Andrew > . > Hi Andrew, The PHY we test is RTL8211F, it supports 10 half. This problem actually is, our board has MAC connected with PHY, when loopback test, packet flow is MAC->PHY->MAC, it needs speed of MAC and PHY should be same when they work. If PHY speed is unknown when PHY goes down, we will not set MAC speed in adjust_link interface. In this case, we hope that PHY speed should not be changed, as the old code of function genphy_loopback() before patch "net: phy: genphy_loopback: add link speed configuration". If PHY has never link, MAC speed has never be set in adjust_link interface, yeah, in this case, MAC and PHY may has different speed, and they can not work. I think we can accept this situation. I think it is general problem if there is MAC connected with PHY.
> Hi Andrew, > The PHY we test is RTL8211F, it supports 10 half. This problem actually is, > our board has MAC connected with PHY, when loopback test, packet flow is > MAC->PHY->MAC, it needs speed of MAC and PHY should be same when they work. > > If PHY speed is unknown when PHY goes down, we will not set MAC speed in > adjust_link interface. In this case, we hope that PHY speed should not be > changed, as the old code of function genphy_loopback() before patch > "net: phy: genphy_loopback: add link speed configuration". > > If PHY has never link, MAC speed has never be set in adjust_link interface, > yeah, in this case, MAC and PHY may has different speed, and they can not work. > I think we can accept this situation. > > I think it is general problem if there is MAC connected with PHY. Thanks for investigating. Looks like we are getting close the real solution. And it is a generic problem, that the MAC and PHY might not be using the same configuration. So it looks like if the link is down, or speed is UNKNOWN, we need to set phydev->link true, speed 10, duplex half, the PHY into 10/Half and call the adjust_link callback. That should get the MAC and PHY to talk to each other. The open question is what to do when we disable loopback. Maybe we need to always set link false, speed unknown and call phy_start_aneg() to restart the link? Andrew
On Thu, Apr 07, 2022 at 04:45:03PM +0200, Andrew Lunn wrote: > > Hi Andrew, > > The PHY we test is RTL8211F, it supports 10 half. This problem actually is, > > our board has MAC connected with PHY, when loopback test, packet flow is > > MAC->PHY->MAC, it needs speed of MAC and PHY should be same when they work. > > > > If PHY speed is unknown when PHY goes down, we will not set MAC speed in > > adjust_link interface. In this case, we hope that PHY speed should not be > > changed, as the old code of function genphy_loopback() before patch > > "net: phy: genphy_loopback: add link speed configuration". > > > > If PHY has never link, MAC speed has never be set in adjust_link interface, > > yeah, in this case, MAC and PHY may has different speed, and they can not work. > > I think we can accept this situation. > > > > I think it is general problem if there is MAC connected with PHY. > > Thanks for investigating. Looks like we are getting close the real > solution. And it is a generic problem, that the MAC and PHY might not > be using the same configuration. > > So it looks like if the link is down, or speed is UNKNOWN, we need to > set phydev->link true, speed 10, duplex half... Hm, i was thinking, is it impossible to run loopback test on half duplex link. Do I missing something? Regards, Oleksij
On Fri, Apr 08, 2022 at 10:18:24AM +0200, Oleksij Rempel wrote: > On Thu, Apr 07, 2022 at 04:45:03PM +0200, Andrew Lunn wrote: > > > Hi Andrew, > > > The PHY we test is RTL8211F, it supports 10 half. This problem actually is, > > > our board has MAC connected with PHY, when loopback test, packet flow is > > > MAC->PHY->MAC, it needs speed of MAC and PHY should be same when they work. > > > > > > If PHY speed is unknown when PHY goes down, we will not set MAC speed in > > > adjust_link interface. In this case, we hope that PHY speed should not be > > > changed, as the old code of function genphy_loopback() before patch > > > "net: phy: genphy_loopback: add link speed configuration". > > > > > > If PHY has never link, MAC speed has never be set in adjust_link interface, > > > yeah, in this case, MAC and PHY may has different speed, and they can not work. > > > I think we can accept this situation. > > > > > > I think it is general problem if there is MAC connected with PHY. > > > > Thanks for investigating. Looks like we are getting close the real > > solution. And it is a generic problem, that the MAC and PHY might not > > be using the same configuration. > > > > So it looks like if the link is down, or speed is UNKNOWN, we need to > > set phydev->link true, speed 10, duplex half... > > Hm, i was thinking, is it impossible to run loopback test on half duplex link. > Do I missing something? Ah, you might be right. I was just thinking of the lowest common denominator. So 10 Full. Or maybe even look at phydev->supported and pick one from there. Andrew
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 8406ac739def..5001bb1a019c 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -2618,6 +2618,9 @@ int genphy_loopback(struct phy_device *phydev, bool enable) ctl |= BMCR_SPEED1000; else if (phydev->speed == SPEED_100) ctl |= BMCR_SPEED100; + else if (phydev->speed == SPEED_UNKNOWN) + return phy_modify(phydev, MII_BMCR, BMCR_LOOPBACK, + BMCR_LOOPBACK); if (phydev->duplex == DUPLEX_FULL) ctl |= BMCR_FULLDPLX;
If phy link status is down because link partner goes down, the phy speed will be updated to SPEED_UNKNOWN when autoneg on with general phy driver. If test loopback in this case, the phy speed will be set to 10M. However, the speed of mac may not be 10M, it causes loopback test failed. To fix this problem, if speed is SPEED_UNKNOWN, don't configure link speed. Fixes: 014068dcb5b1 ("net: phy: genphy_loopback: add link speed configuration") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> --- drivers/net/phy/phy_device.c | 3 +++ 1 file changed, 3 insertions(+)