diff mbox series

net: phy: genphy_loopback: fix loopback failed when speed is unknown

Message ID 20220331114819.14929-1-huangguangbin2@huawei.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series net: phy: genphy_loopback: fix loopback failed when speed is unknown | expand

Checks

Context Check Description
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 1 blamed authors not CCed: linux@rempel-privat.de; 4 maintainers not CCed: linux@armlinux.org.uk linux@rempel-privat.de hkallweit1@gmail.com pabeni@redhat.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Guangbin Huang March 31, 2022, 11:48 a.m. UTC
If phy link status is down because link partner goes down, the phy speed
will be updated to SPEED_UNKNOWN when autoneg on with general phy driver.
If test loopback in this case, the phy speed will be set to 10M. However,
the speed of mac may not be 10M, it causes loopback test failed.

To fix this problem, if speed is SPEED_UNKNOWN, don't configure link speed.

Fixes: 014068dcb5b1 ("net: phy: genphy_loopback: add link speed configuration")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
---
 drivers/net/phy/phy_device.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Andrew Lunn March 31, 2022, 12:23 p.m. UTC | #1
On Thu, Mar 31, 2022 at 07:48:19PM +0800, Guangbin Huang wrote:
> If phy link status is down because link partner goes down, the phy speed
> will be updated to SPEED_UNKNOWN when autoneg on with general phy driver.
> If test loopback in this case, the phy speed will be set to 10M. However,
> the speed of mac may not be 10M, it causes loopback test failed.
> 
> To fix this problem, if speed is SPEED_UNKNOWN, don't configure link speed.

I don't think this explanation is correct.

If speed is UNKNOWN, ctl is just going to have BMCR_LOOPBACK set. That
is very similar to what you are doing. The code then waits for the
link to establish. This is where i guess your problem is. Are you
seeing ETIMEDOUT? Does the link not establish?

Thanks
	Andrew
Guangbin Huang March 31, 2022, 1:57 p.m. UTC | #2
On 2022/3/31 20:23, Andrew Lunn wrote:
> On Thu, Mar 31, 2022 at 07:48:19PM +0800, Guangbin Huang wrote:
>> If phy link status is down because link partner goes down, the phy speed
>> will be updated to SPEED_UNKNOWN when autoneg on with general phy driver.
>> If test loopback in this case, the phy speed will be set to 10M. However,
>> the speed of mac may not be 10M, it causes loopback test failed.
>>
>> To fix this problem, if speed is SPEED_UNKNOWN, don't configure link speed.
> 
> I don't think this explanation is correct.
> 
> If speed is UNKNOWN, ctl is just going to have BMCR_LOOPBACK set. That
> is very similar to what you are doing. The code then waits for the
> link to establish. This is where i guess your problem is. Are you
> seeing ETIMEDOUT? Does the link not establish?
> 
> Thanks
> 	Andrew	
> .
> 
Hi Andrew
This problem is not timeout, I have print return value of phy_read_poll_timeout()
and it is 0.

In this case, as speed and duplex both are unknown, ctl is just set to 0x4000.
However, the follow code sets mask to ~0 for function phy_modify():
int genphy_loopback(struct phy_device *phydev, bool enable)
{
	if (enable) {
		...
		phy_modify(phydev, MII_BMCR, ~0, ctl);
		...
}
so all other bits of BMCR will be cleared and just set bit 14, I use phy trace to
prove that:

$ cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 923/923   #P:128
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
   kworker/u257:2-694     [015] .....   209.263912: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
   kworker/u257:2-694     [015] .....   209.263951: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
   kworker/u257:2-694     [015] .....   209.263990: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
   kworker/u257:2-694     [015] .....   209.264028: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x09 val:0x0200
   kworker/u257:2-694     [015] .....   209.264067: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x0a val:0x0000
          ethtool-1148    [007] .....   209.665693: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
          ethtool-1148    [007] .....   209.665706: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1840
          ethtool-1148    [007] .....   210.588139: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1840
          ethtool-1148    [007] .....   210.588152: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1040
          ethtool-1148    [007] .....   210.615900: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
          ethtool-1148    [007] .....   210.615912: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x4000 //here just set bit 14
          ethtool-1148    [007] .....   210.620952: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.625992: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.631034: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.636075: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.641116: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.646159: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.651215: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.656256: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.661296: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.666338: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
          ethtool-1148    [007] .....   210.671378: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x798d
          ethtool-1148    [007] .....   210.679016: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x4000
          ethtool-1148    [007] .....   210.679053: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x798d
          ethtool-1148    [007] .....   210.679091: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x09 val:0x0200
          ethtool-1148    [007] .....   210.679129: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x0a val:0x0000
          ethtool-1148    [007] .....   210.695902: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x4000
          ethtool-1148    [007] .....   210.695939: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x798d
          ethtool-1148    [007] .....   210.695977: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x09 val:0x0200
          ethtool-1148    [007] .....   210.696014: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x0a val:0x0000

So phy speed will be set to 10M in this case, if previous speed of device before going
down is 10M, loopback test is pass. Only previous speed is 100M or 1000M, loopback test is failed.

Thanks
	Guangbin
.
Andrew Lunn March 31, 2022, 2:26 p.m. UTC | #3
> In this case, as speed and duplex both are unknown, ctl is just set to 0x4000.
> However, the follow code sets mask to ~0 for function phy_modify():
> int genphy_loopback(struct phy_device *phydev, bool enable)
> {
> 	if (enable) {
> 		...
> 		phy_modify(phydev, MII_BMCR, ~0, ctl);
> 		...
> }
> so all other bits of BMCR will be cleared and just set bit 14, I use phy trace to
> prove that:
> 
> $ cat /sys/kernel/debug/tracing/trace
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 923/923   #P:128
> #
> #                                _-----=> irqs-off/BH-disabled
> #                               / _----=> need-resched
> #                              | / _---=> hardirq/softirq
> #                              || / _--=> preempt-depth
> #                              ||| / _-=> migrate-disable
> #                              |||| /     delay
> #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> #              | |         |   |||||     |         |
>   kworker/u257:2-694     [015] .....   209.263912: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
>   kworker/u257:2-694     [015] .....   209.263951: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
>   kworker/u257:2-694     [015] .....   209.263990: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
>   kworker/u257:2-694     [015] .....   209.264028: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x09 val:0x0200
>   kworker/u257:2-694     [015] .....   209.264067: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x0a val:0x0000
>          ethtool-1148    [007] .....   209.665693: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
>          ethtool-1148    [007] .....   209.665706: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1840
>          ethtool-1148    [007] .....   210.588139: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1840
>          ethtool-1148    [007] .....   210.588152: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1040
>          ethtool-1148    [007] .....   210.615900: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
>          ethtool-1148    [007] .....   210.615912: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x4000 //here just set bit 14
> 
> So phy speed will be set to 10M in this case, if previous speed of
> device before going down is 10M, loopback test is pass. Only
> previous speed is 100M or 1000M, loopback test is failed.

O.K. So it should be set into 10M half duplex. But why does this cause
it not to loopback packets? Does the PHY you are using not actually
support 10 Half? Why does it need to be the same speed as when the
link was up? And why does it actually set LSTATUS indicating there is
link?

Is this a generic problem, all PHYs are like this, or is this specific
to the PHY you are using? Maybe this PHY needs its own loopback
function because it does something odd?

   Andrew
Oleksij Rempel April 1, 2022, 6:40 a.m. UTC | #4
On Thu, Mar 31, 2022 at 04:26:47PM +0200, Andrew Lunn wrote:
> > In this case, as speed and duplex both are unknown, ctl is just set to 0x4000.
> > However, the follow code sets mask to ~0 for function phy_modify():
> > int genphy_loopback(struct phy_device *phydev, bool enable)
> > {
> > 	if (enable) {
> > 		...
> > 		phy_modify(phydev, MII_BMCR, ~0, ctl);
> > 		...
> > }
> > so all other bits of BMCR will be cleared and just set bit 14, I use phy trace to
> > prove that:
> > 
> > $ cat /sys/kernel/debug/tracing/trace
> > # tracer: nop
> > #
> > # entries-in-buffer/entries-written: 923/923   #P:128
> > #
> > #                                _-----=> irqs-off/BH-disabled
> > #                               / _----=> need-resched
> > #                              | / _---=> hardirq/softirq
> > #                              || / _--=> preempt-depth
> > #                              ||| / _-=> migrate-disable
> > #                              |||| /     delay
> > #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> > #              | |         |   |||||     |         |
> >   kworker/u257:2-694     [015] .....   209.263912: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
> >   kworker/u257:2-694     [015] .....   209.263951: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
> >   kworker/u257:2-694     [015] .....   209.263990: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x7989
> >   kworker/u257:2-694     [015] .....   209.264028: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x09 val:0x0200
> >   kworker/u257:2-694     [015] .....   209.264067: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x0a val:0x0000
> >          ethtool-1148    [007] .....   209.665693: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
> >          ethtool-1148    [007] .....   209.665706: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1840
> >          ethtool-1148    [007] .....   210.588139: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1840
> >          ethtool-1148    [007] .....   210.588152: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1040
> >          ethtool-1148    [007] .....   210.615900: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
> >          ethtool-1148    [007] .....   210.615912: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x4000 //here just set bit 14
> > 
> > So phy speed will be set to 10M in this case, if previous speed of
> > device before going down is 10M, loopback test is pass. Only
> > previous speed is 100M or 1000M, loopback test is failed.
> 
> O.K. So it should be set into 10M half duplex. But why does this cause
> it not to loopback packets? Does the PHY you are using not actually
> support 10 Half? Why does it need to be the same speed as when the
> link was up? And why does it actually set LSTATUS indicating there is
> link?
> 
> Is this a generic problem, all PHYs are like this, or is this specific
> to the PHY you are using? Maybe this PHY needs its own loopback
> function because it does something odd?

It looks for me like attempt to fix loopback test for setup without active
link partner. Correct?

Regards,
Oleksij
Andrew Lunn April 1, 2022, 12:14 p.m. UTC | #5
> > O.K. So it should be set into 10M half duplex. But why does this cause
> > it not to loopback packets? Does the PHY you are using not actually
> > support 10 Half? Why does it need to be the same speed as when the
> > link was up? And why does it actually set LSTATUS indicating there is
> > link?
> > 
> > Is this a generic problem, all PHYs are like this, or is this specific
> > to the PHY you are using? Maybe this PHY needs its own loopback
> > function because it does something odd?
> 
> It looks for me like attempt to fix loopback test for setup without active
> link partner. Correct?

You should not need a link partner for loopback to work. This is local
loopback. The PHY is also saying it has link, if the LSTATUS bit is
set. So i don't see why previous speed is relevant hear. This seems to
me to be an issue for this particular PHY.

What i don't like about this patch is that it is not deterministic
what mode the PHY will end up in if speed is unknown. Without the
patch, it is 10Mbps, which is historically a sensible default.

If this PHY has never had link, what speed does it use? Does it still
work in that case?

   Andrew
Guangbin Huang April 7, 2022, 1:54 p.m. UTC | #6
On 2022/4/1 20:14, Andrew Lunn wrote:
>>> O.K. So it should be set into 10M half duplex. But why does this cause
>>> it not to loopback packets? Does the PHY you are using not actually
>>> support 10 Half? Why does it need to be the same speed as when the
>>> link was up? And why does it actually set LSTATUS indicating there is
>>> link?
>>>
>>> Is this a generic problem, all PHYs are like this, or is this specific
>>> to the PHY you are using? Maybe this PHY needs its own loopback
>>> function because it does something odd?
>>
>> It looks for me like attempt to fix loopback test for setup without active
>> link partner. Correct?
> 
> You should not need a link partner for loopback to work. This is local
> loopback. The PHY is also saying it has link, if the LSTATUS bit is
> set. So i don't see why previous speed is relevant hear. This seems to
> me to be an issue for this particular PHY.
> 
> What i don't like about this patch is that it is not deterministic
> what mode the PHY will end up in if speed is unknown. Without the
> patch, it is 10Mbps, which is historically a sensible default.
> 
> If this PHY has never had link, what speed does it use? Does it still
> work in that case?
> 
>     Andrew
> .
> 
Hi Andrew,
The PHY we test is RTL8211F, it supports 10 half. This problem actually is,
our board has MAC connected with PHY, when loopback test, packet flow is
MAC->PHY->MAC, it needs speed of MAC and PHY should be same when they work.

If PHY speed is unknown when PHY goes down, we will not set MAC speed in
adjust_link interface. In this case, we hope that PHY speed should not be
changed, as the old code of function genphy_loopback() before patch
"net: phy: genphy_loopback: add link speed configuration".

If PHY has never link, MAC speed has never be set in adjust_link interface,
yeah, in this case, MAC and PHY may has different speed, and they can not work.
I think we can accept this situation.

I think it is general problem if there is MAC connected with PHY.
Andrew Lunn April 7, 2022, 2:45 p.m. UTC | #7
> Hi Andrew,
> The PHY we test is RTL8211F, it supports 10 half. This problem actually is,
> our board has MAC connected with PHY, when loopback test, packet flow is
> MAC->PHY->MAC, it needs speed of MAC and PHY should be same when they work.
> 
> If PHY speed is unknown when PHY goes down, we will not set MAC speed in
> adjust_link interface. In this case, we hope that PHY speed should not be
> changed, as the old code of function genphy_loopback() before patch
> "net: phy: genphy_loopback: add link speed configuration".
> 
> If PHY has never link, MAC speed has never be set in adjust_link interface,
> yeah, in this case, MAC and PHY may has different speed, and they can not work.
> I think we can accept this situation.
> 
> I think it is general problem if there is MAC connected with PHY.

Thanks for investigating. Looks like we are getting close the real
solution. And it is a generic problem, that the MAC and PHY might not
be using the same configuration.

So it looks like if the link is down, or speed is UNKNOWN, we need to
set phydev->link true, speed 10, duplex half, the PHY into 10/Half and
call the adjust_link callback. That should get the MAC and PHY to talk
to each other.

The open question is what to do when we disable loopback. Maybe we
need to always set link false, speed unknown and call phy_start_aneg()
to restart the link?

   Andrew
Oleksij Rempel April 8, 2022, 8:18 a.m. UTC | #8
On Thu, Apr 07, 2022 at 04:45:03PM +0200, Andrew Lunn wrote:
> > Hi Andrew,
> > The PHY we test is RTL8211F, it supports 10 half. This problem actually is,
> > our board has MAC connected with PHY, when loopback test, packet flow is
> > MAC->PHY->MAC, it needs speed of MAC and PHY should be same when they work.
> > 
> > If PHY speed is unknown when PHY goes down, we will not set MAC speed in
> > adjust_link interface. In this case, we hope that PHY speed should not be
> > changed, as the old code of function genphy_loopback() before patch
> > "net: phy: genphy_loopback: add link speed configuration".
> > 
> > If PHY has never link, MAC speed has never be set in adjust_link interface,
> > yeah, in this case, MAC and PHY may has different speed, and they can not work.
> > I think we can accept this situation.
> > 
> > I think it is general problem if there is MAC connected with PHY.
> 
> Thanks for investigating. Looks like we are getting close the real
> solution. And it is a generic problem, that the MAC and PHY might not
> be using the same configuration.
> 
> So it looks like if the link is down, or speed is UNKNOWN, we need to
> set phydev->link true, speed 10, duplex half...

Hm, i was thinking, is it impossible to run loopback test on half duplex link.
Do I missing something?

Regards,
Oleksij
Andrew Lunn April 8, 2022, 1:04 p.m. UTC | #9
On Fri, Apr 08, 2022 at 10:18:24AM +0200, Oleksij Rempel wrote:
> On Thu, Apr 07, 2022 at 04:45:03PM +0200, Andrew Lunn wrote:
> > > Hi Andrew,
> > > The PHY we test is RTL8211F, it supports 10 half. This problem actually is,
> > > our board has MAC connected with PHY, when loopback test, packet flow is
> > > MAC->PHY->MAC, it needs speed of MAC and PHY should be same when they work.
> > > 
> > > If PHY speed is unknown when PHY goes down, we will not set MAC speed in
> > > adjust_link interface. In this case, we hope that PHY speed should not be
> > > changed, as the old code of function genphy_loopback() before patch
> > > "net: phy: genphy_loopback: add link speed configuration".
> > > 
> > > If PHY has never link, MAC speed has never be set in adjust_link interface,
> > > yeah, in this case, MAC and PHY may has different speed, and they can not work.
> > > I think we can accept this situation.
> > > 
> > > I think it is general problem if there is MAC connected with PHY.
> > 
> > Thanks for investigating. Looks like we are getting close the real
> > solution. And it is a generic problem, that the MAC and PHY might not
> > be using the same configuration.
> > 
> > So it looks like if the link is down, or speed is UNKNOWN, we need to
> > set phydev->link true, speed 10, duplex half...
> 
> Hm, i was thinking, is it impossible to run loopback test on half duplex link.
> Do I missing something?

Ah, you might be right. I was just thinking of the lowest common
denominator. So 10 Full. Or maybe even look at phydev->supported and
pick one from there.

     Andrew
diff mbox series

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 8406ac739def..5001bb1a019c 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -2618,6 +2618,9 @@  int genphy_loopback(struct phy_device *phydev, bool enable)
 			ctl |= BMCR_SPEED1000;
 		else if (phydev->speed == SPEED_100)
 			ctl |= BMCR_SPEED100;
+		else if (phydev->speed == SPEED_UNKNOWN)
+			return phy_modify(phydev, MII_BMCR, BMCR_LOOPBACK,
+					  BMCR_LOOPBACK);
 
 		if (phydev->duplex == DUPLEX_FULL)
 			ctl |= BMCR_FULLDPLX;