diff mbox series

[RFC,net-next,4/5] net: phy: introduce is_c45_over_c22 flag

Message ID 20220323183419.2278676-5-michael@walle.cc (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series net: phy: C45-over-C22 access | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 412 this patch: 413
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang fail Errors and warnings before: 304 this patch: 305
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 402 this patch: 403
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 63 lines checked
netdev/kdoc fail Errors and warnings before: 0 this patch: 1
netdev/source_inline success Was 0 now: 0

Commit Message

Michael Walle March 23, 2022, 6:34 p.m. UTC
The GPY215 driver supports indirect accesses to c45 over the c22
registers. In its probe function phy_get_c45_ids() is called and the
author descibed their use case as follows:

  The problem comes from condition "phydev->c45_ids.mmds_present &
  MDIO_DEVS_AN".

  Our product supports both C22 and C45.

  In the real system, we found C22 was used by customers (with indirect
  access to C45 registers when necessary).

So it is pretty clear that the intention was to have a method to use the
c45 features over a c22-only MDIO bus. The purpose of calling
phy_get_c45_ids() is to populate the .c45_ids for a PHY which wasn't
probed as a c45 one. Thus, first rename the phy_get_c45_ids() function
to reflect its actual meaning and second, add a new flag which indicates
that this is actually a c45 PHY but behind a c22 bus. The latter is
important for phylink because phylink will treat c45 in a special way by
checking the .is_c45 property. But in our case this isn't set.

Signed-off-by: Michael Walle <michael@walle.cc>
---
 drivers/net/phy/mxl-gpy.c    |  2 +-
 drivers/net/phy/phy_device.c | 20 +++++++++++++++-----
 include/linux/phy.h          |  4 +++-
 3 files changed, 19 insertions(+), 7 deletions(-)

Comments

Andrew Lunn March 23, 2022, 8:07 p.m. UTC | #1
On Wed, Mar 23, 2022 at 07:34:18PM +0100, Michael Walle wrote:
> The GPY215 driver supports indirect accesses to c45 over the c22
> registers. In its probe function phy_get_c45_ids() is called and the
> author descibed their use case as follows:
> 
>   The problem comes from condition "phydev->c45_ids.mmds_present &
>   MDIO_DEVS_AN".
> 
>   Our product supports both C22 and C45.
> 
>   In the real system, we found C22 was used by customers (with indirect
>   access to C45 registers when necessary).
> 
> So it is pretty clear that the intention was to have a method to use the
> c45 features over a c22-only MDIO bus. The purpose of calling
> phy_get_c45_ids() is to populate the .c45_ids for a PHY which wasn't
> probed as a c45 one. Thus, first rename the phy_get_c45_ids() function
> to reflect its actual meaning and second, add a new flag which indicates
> that this is actually a c45 PHY but behind a c22 bus. The latter is
> important for phylink because phylink will treat c45 in a special way by
> checking the .is_c45 property. But in our case this isn't set.

Thinking out loud...

1) We have a C22 only bus. Easy, C45 over C22 should be used.

2) We have a C45 only bus. Easy, C45 should be used, and it will of
   probed that way.

3) We have a C22 and C45 bus, but MDIOBUS_NO_CAP. It will probe C22,
   but ideally we want to swap to C45.

4) We have a C22 and C45 bus, MDIOBUS_C22_C45. It will probe C22, but
   ideally we want to swap to C45.

> @@ -99,7 +99,7 @@ static int gpy_probe(struct phy_device *phydev)
>  	int ret;
>  
>  	if (!phydev->is_c45) {
> -		ret = phy_get_c45_ids(phydev);
> +		ret = phy_get_c45_ids_by_c22(phydev);
>  		if (ret < 0)
>  			return ret;
>  	}

If we are inside the if, we know we probed C22. We have to achieve two
things:

1) Get the c45 ids,
2) Figure out if C45 works, or if C45 over C22 is needed.

I don't see how we are getting this second bit of information, if we
are explicitly using c45 over c22.

This _by_c22 is also making me think of the previous patch, where we
look at the bus capabilities. We are explicitly saying here was want
c45 over c22, and the PHY driver should know the PHY is capable of
it. So we don't need to look at the capabilities, just do it.

     Andrew
Michael Walle March 23, 2022, 10:38 p.m. UTC | #2
Am 2022-03-23 21:07, schrieb Andrew Lunn:
> On Wed, Mar 23, 2022 at 07:34:18PM +0100, Michael Walle wrote:
>> The GPY215 driver supports indirect accesses to c45 over the c22
>> registers. In its probe function phy_get_c45_ids() is called and the
>> author descibed their use case as follows:
>> 
>>   The problem comes from condition "phydev->c45_ids.mmds_present &
>>   MDIO_DEVS_AN".
>> 
>>   Our product supports both C22 and C45.
>> 
>>   In the real system, we found C22 was used by customers (with 
>> indirect
>>   access to C45 registers when necessary).
>> 
>> So it is pretty clear that the intention was to have a method to use 
>> the
>> c45 features over a c22-only MDIO bus. The purpose of calling
>> phy_get_c45_ids() is to populate the .c45_ids for a PHY which wasn't
>> probed as a c45 one. Thus, first rename the phy_get_c45_ids() function
>> to reflect its actual meaning and second, add a new flag which 
>> indicates
>> that this is actually a c45 PHY but behind a c22 bus. The latter is
>> important for phylink because phylink will treat c45 in a special way 
>> by
>> checking the .is_c45 property. But in our case this isn't set.
> 
> Thinking out loud...
> 
> 1) We have a C22 only bus. Easy, C45 over C22 should be used.
> 
> 2) We have a C45 only bus. Easy, C45 should be used, and it will of
>    probed that way.
> 
> 3) We have a C22 and C45 bus, but MDIOBUS_NO_CAP. It will probe C22,
>    but ideally we want to swap to C45.
> 
> 4) We have a C22 and C45 bus, MDIOBUS_C22_C45. It will probe C22, but
>    ideally we want to swap to C45.

I presume you are speaking of
https://elixir.bootlin.com/linux/v5.17/source/drivers/net/phy/mdio_bus.c#L700

Shouldn't that be the other way around then? How would you tell if
you can do C45?

>> @@ -99,7 +99,7 @@ static int gpy_probe(struct phy_device *phydev)
>>  	int ret;
>> 
>>  	if (!phydev->is_c45) {
>> -		ret = phy_get_c45_ids(phydev);
>> +		ret = phy_get_c45_ids_by_c22(phydev);
>>  		if (ret < 0)
>>  			return ret;
>>  	}
> 
> If we are inside the if, we know we probed C22. We have to achieve two
> things:
> 
> 1) Get the c45 ids,
> 2) Figure out if C45 works, or if C45 over C22 is needed.
> 
> I don't see how we are getting this second bit of information, if we
> are explicitly using c45 over c22.

That is related to how C45 capable PHYs are probed (your 4) above),
right? If the PHY would be probed correctly as C45 we wouldn't have
to worry about it. TBH I didn't consider that a valid case because
I thought there were other means to tell "treat this PHY as C45",
that is by the device tree compatible, for example.

Btw. all of this made me question if this is actually the correct
place, or if if shouldn't be handled in the core. With a flag
in the phy driver which might indicate its capable of doing
c45 over c22.

> This _by_c22 is also making me think of the previous patch, where we
> look at the bus capabilities. We are explicitly saying here was want
> c45 over c22, and the PHY driver should know the PHY is capable of
> it. So we don't need to look at the capabilities, just do it.

Mh? I can't follow you here. Are you talking about the
probe_capabilites? These are for the bus probing, i.e. if you can
call mdiobus_c45_read().

-michael
Andrew Lunn March 24, 2022, 12:41 a.m. UTC | #3
> > Thinking out loud...
> > 
> > 1) We have a C22 only bus. Easy, C45 over C22 should be used.
> > 
> > 2) We have a C45 only bus. Easy, C45 should be used, and it will of
> >    probed that way.
> > 
> > 3) We have a C22 and C45 bus, but MDIOBUS_NO_CAP. It will probe C22,
> >    but ideally we want to swap to C45.
> > 
> > 4) We have a C22 and C45 bus, MDIOBUS_C22_C45. It will probe C22, but
> >    ideally we want to swap to C45.
> 
> I presume you are speaking of
> https://elixir.bootlin.com/linux/v5.17/source/drivers/net/phy/mdio_bus.c#L700

Yes.

> Shouldn't that be the other way around then? How would you tell if
> you can do C45?

For a long time, only C22 was probed. To actually make use of a C45
only PHY, you had to have a compatible string in DT which indicated a
C45 device is present on the bus. But then none DT systems came along
which needed to find a C45 only PHY during probe without the hint it
is was actually there. That is when the probe capabilities we added,
and the scan extended to look for a C45 device if the capabilities
indicates the bus actually supported it. But to keep backwards
compatibility, C22 was scanned first and then C45 afterwards.

To some extent, we need to separate finding the device on the bus to
actually using the device. The device might respond to C22, give us
its ID, get the correct driver loaded based on that ID, and the driver
then uses the C45 address space to actually configure the PHY.

Then there is the Marvel 10G PHY. It responds to C22, but returns 0
for the ID! There is a special case for this in the code, it then
looks in the C45 space and uses the ID from there, if it finds
something useful.

So as i said in my reply to the cover letter, we have two different
state variables:

1) The PHY has the C45 register space.

2) We need to either use C45 transfers, or C45 over C22 transfers to
   access the C45 register space.

And we potentially have a chicken/egg problem. The PHY driver knows
1), but in order to know what driver to load we need the ID registers
from the PHY, or some external hint like DT. We are also currently
only probing C22, or C45, but not C45 over C22. And i'm not sure we
actually can probe C45 over C22 because there are C22 only PHYs which
use those two register for other things. So we are back to the driver
again which does know if C45 over C22 will work.

So phydev->has_c45 we can provisionally set if we probed the PHY by
C45. But the driver should also set it if it knows better, or even the
core can set it the first time the driver uses an _mmd API call.

phydev->c45_over_c22 we are currently in a bad shape for. We cannot
reliably say the bus master supports C45. If the bus capabilities say
C22 only, we can set phydev->c45_over_c22. If the bus capabilities
list C45, we can set it false. But that only covers a few busses, most
don't have any capabilities set. We can try a C45 access and see if we
get an -EOPNOTSUPP, in which case we can set phydev->c45_over_c22. But
the bus driver could also do the wrong thing, issue a C22 transfer and
give us back rubbish.

So i think we do need to review all the bus drivers and any which do
support C45 need their capabilities set to indicate that. We can then
set phydev->c45_over_c22.

    Andrew
Michael Walle March 24, 2022, 4:03 p.m. UTC | #4
Am 2022-03-24 01:41, schrieb Andrew Lunn:
>> > Thinking out loud...
>> >
>> > 1) We have a C22 only bus. Easy, C45 over C22 should be used.
>> >
>> > 2) We have a C45 only bus. Easy, C45 should be used, and it will of
>> >    probed that way.
>> >
>> > 3) We have a C22 and C45 bus, but MDIOBUS_NO_CAP. It will probe C22,
>> >    but ideally we want to swap to C45.
>> >
>> > 4) We have a C22 and C45 bus, MDIOBUS_C22_C45. It will probe C22, but
>> >    ideally we want to swap to C45.
>> 
>> I presume you are speaking of
>> https://elixir.bootlin.com/linux/v5.17/source/drivers/net/phy/mdio_bus.c#L700
> 
> Yes.
> 
>> Shouldn't that be the other way around then? How would you tell if
>> you can do C45?
> 
> For a long time, only C22 was probed. To actually make use of a C45
> only PHY, you had to have a compatible string in DT which indicated a
> C45 device is present on the bus. But then none DT systems came along
> which needed to find a C45 only PHY during probe without the hint it
> is was actually there. That is when the probe capabilities we added,
> and the scan extended to look for a C45 device if the capabilities
> indicates the bus actually supported it. But to keep backwards
> compatibility, C22 was scanned first and then C45 afterwards.

Ok, I figured.

> To some extent, we need to separate finding the device on the bus to
> actually using the device. The device might respond to C22, give us
> its ID, get the correct driver loaded based on that ID, and the driver
> then uses the C45 address space to actually configure the PHY.
> 
> Then there is the Marvel 10G PHY. It responds to C22, but returns 0
> for the ID! There is a special case for this in the code, it then
> looks in the C45 space and uses the ID from there, if it finds
> something useful.
> 
> So as i said in my reply to the cover letter, we have two different
> state variables:
> 
> 1) The PHY has the C45 register space.
> 
> 2) We need to either use C45 transfers, or C45 over C22 transfers to
>    access the C45 register space.
> 
> And we potentially have a chicken/egg problem. The PHY driver knows
> 1), but in order to know what driver to load we need the ID registers
> from the PHY, or some external hint like DT. We are also currently
> only probing C22, or C45, but not C45 over C22. And i'm not sure we
> actually can probe C45 over C22 because there are C22 only PHYs which
> use those two register for other things. So we are back to the driver
> again which does know if C45 over C22 will work.

Isn't it safe to assume that if a PHY implements the indirect
registers for c45 in its c22 space that it will also have a valid
PHY ID and then the it's driver will be probed? So if a PHY is
probed as c22 its driver might tell us "wait, it's actually a c45
phy and hey for your convenience it also have the indirect registers
in c22". We can then set has_c45 and maybe c45_over_c22 (also depending
on the bus capabilities).

> So phydev->has_c45 we can provisionally set if we probed the PHY by
> C45. But the driver should also set it if it knows better, or even the
> core can set it the first time the driver uses an _mmd API call.

I'm not sure about the _mmd calls, there are PHYs which have MMDs
(I guess EEE is an example?) but are not capable of C45 accesses.

> phydev->c45_over_c22 we are currently in a bad shape for. We cannot
> reliably say the bus master supports C45. If the bus capabilities say
> C22 only, we can set phydev->c45_over_c22. If the bus capabilities
> list C45, we can set it false. But that only covers a few busses, most
> don't have any capabilities set. We can try a C45 access and see if we
> get an -EOPNOTSUPP, in which case we can set phydev->c45_over_c22. But
> the bus driver could also do the wrong thing, issue a C22 transfer and
> give us back rubbish.

First question, what do you think about keeping the is_c45 property but
with a different meaning and add use_c45_over_c22. That way it will be
less code churn:

  * @is_c45:  Set to true if this PHY has clause 45 address space.
  * @use_c45_over_c22:  Set to true if c45-over-c22 addressing is used.

  1) c45 phy probed as c45 -> is_c45 = 1, like it is at the moment
     use c45 transfers
2a) c45 phy probed as c22 -> is_c45 = 1, use_c45_over_c22 = 0
     use c45 transfers
2b) c45 phy probed as c22 -> is_c45 = 1, use_c45_over_c22 = 1
     use c22 transfers

Did I miss something?

To promote from c22 to c45 we could look at a flag in the PHY
driver. That should be basically that what the gpy driver is trying
to do with the "if (!is_c45) get_c45_ids()" (but fail).

> So i think we do need to review all the bus drivers and any which do
> support C45 need their capabilities set to indicate that. We can then
> set phydev->c45_over_c22.

I could add the probe_capabilites, or at least I could try. But it won't
tell us if the PHY is capable of the indirect addressing. So we aren't
much further in this regard. But IMHO this can be done incrementally
if someone is interested in having that feature. He should also be in
the position to test it properly.

[just saw your latest mail while writing this, so half of it is done
anyway, btw, I did the same today ;)]

-michael
Andrew Lunn March 24, 2022, 4:23 p.m. UTC | #5
> > To some extent, we need to separate finding the device on the bus to
> > actually using the device. The device might respond to C22, give us
> > its ID, get the correct driver loaded based on that ID, and the driver
> > then uses the C45 address space to actually configure the PHY.
> > 
> > Then there is the Marvel 10G PHY. It responds to C22, but returns 0
> > for the ID! There is a special case for this in the code, it then
> > looks in the C45 space and uses the ID from there, if it finds
> > something useful.
> > 
> > So as i said in my reply to the cover letter, we have two different
> > state variables:
> > 
> > 1) The PHY has the C45 register space.
> > 
> > 2) We need to either use C45 transfers, or C45 over C22 transfers to
> >    access the C45 register space.
> > 
> > And we potentially have a chicken/egg problem. The PHY driver knows
> > 1), but in order to know what driver to load we need the ID registers
> > from the PHY, or some external hint like DT. We are also currently
> > only probing C22, or C45, but not C45 over C22. And i'm not sure we
> > actually can probe C45 over C22 because there are C22 only PHYs which
> > use those two register for other things. So we are back to the driver
> > again which does know if C45 over C22 will work.
> 
> Isn't it safe to assume that if a PHY implements the indirect
> registers for c45 in its c22 space that it will also have a valid
> PHY ID and then the it's driver will be probed?

See: https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy_device.c#L895

No valid ID in C22 space.

> So if a PHY is
> probed as c22 its driver might tell us "wait, it's actually a c45
> phy and hey for your convenience it also have the indirect registers
> in c22". We can then set has_c45 and maybe c45_over_c22 (also depending
> on the bus capabilities).

In general, if the core can do something, it is better than the driver
doing it. If the core cannot reliably figure it out, then we have to
leave it to the drivers. It could well be we need the drivers to set
has_c45. I would prefer that drivers don't touch c45_over_c22 because
they don't have the knowledge of what the bus is capable of doing. The
only valid case i can think of is for a very oddball PHY which has C45
register space, but cannot actually do C45 transfers, and so C45 over
C22 is the only option.

> > So phydev->has_c45 we can provisionally set if we probed the PHY by
> > C45. But the driver should also set it if it knows better, or even the
> > core can set it the first time the driver uses an _mmd API call.
> 
> I'm not sure about the _mmd calls, there are PHYs which have MMDs
> (I guess EEE is an example?) but are not capable of C45 accesses.

Ah, yes, i forgot about EEE. That was a bad idea.

> > phydev->c45_over_c22 we are currently in a bad shape for. We cannot
> > reliably say the bus master supports C45. If the bus capabilities say
> > C22 only, we can set phydev->c45_over_c22. If the bus capabilities
> > list C45, we can set it false. But that only covers a few busses, most
> > don't have any capabilities set. We can try a C45 access and see if we
> > get an -EOPNOTSUPP, in which case we can set phydev->c45_over_c22. But
> > the bus driver could also do the wrong thing, issue a C22 transfer and
> > give us back rubbish.
> 
> First question, what do you think about keeping the is_c45 property but
> with a different meaning and add use_c45_over_c22. That way it will be
> less code churn:
> 
>  * @is_c45:  Set to true if this PHY has clause 45 address space.
>  * @use_c45_over_c22:  Set to true if c45-over-c22 addressing is used.

I prefer to change is_c45. We then get the compiler to help us with
code review. The build bots will tell us about any code we fail to
check and change. It will also help anybody with out of tree code
making use of is_c45.

       Andrew
Michael Walle March 24, 2022, 5:18 p.m. UTC | #6
Am 2022-03-24 17:23, schrieb Andrew Lunn:
>> > To some extent, we need to separate finding the device on the bus to
>> > actually using the device. The device might respond to C22, give us
>> > its ID, get the correct driver loaded based on that ID, and the driver
>> > then uses the C45 address space to actually configure the PHY.
>> >
>> > Then there is the Marvel 10G PHY. It responds to C22, but returns 0
>> > for the ID! There is a special case for this in the code, it then
>> > looks in the C45 space and uses the ID from there, if it finds
>> > something useful.
>> >
>> > So as i said in my reply to the cover letter, we have two different
>> > state variables:
>> >
>> > 1) The PHY has the C45 register space.
>> >
>> > 2) We need to either use C45 transfers, or C45 over C22 transfers to
>> >    access the C45 register space.
>> >
>> > And we potentially have a chicken/egg problem. The PHY driver knows
>> > 1), but in order to know what driver to load we need the ID registers
>> > from the PHY, or some external hint like DT. We are also currently
>> > only probing C22, or C45, but not C45 over C22. And i'm not sure we
>> > actually can probe C45 over C22 because there are C22 only PHYs which
>> > use those two register for other things. So we are back to the driver
>> > again which does know if C45 over C22 will work.
>> 
>> Isn't it safe to assume that if a PHY implements the indirect
>> registers for c45 in its c22 space that it will also have a valid
>> PHY ID and then the it's driver will be probed?
> 
> See:
> https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy_device.c#L895
> 
> No valid ID in C22 space.

I actually looked at the datasheet and yes, it implements the
registers 13 and 14 in c22 to access the c45 space. I couldn't
find any descriptions of other c22 registers though.

>> So if a PHY is
>> probed as c22 its driver might tell us "wait, it's actually a c45
>> phy and hey for your convenience it also have the indirect registers
>> in c22". We can then set has_c45 and maybe c45_over_c22 (also 
>> depending
>> on the bus capabilities).
> 
> In general, if the core can do something, it is better than the driver
> doing it. If the core cannot reliably figure it out, then we have to
> leave it to the drivers. It could well be we need the drivers to set
> has_c45. I would prefer that drivers don't touch c45_over_c22 because
> they don't have the knowledge of what the bus is capable of doing. The
> only valid case i can think of is for a very oddball PHY which has C45
> register space, but cannot actually do C45 transfers, and so C45 over
> C22 is the only option.

And how would you know that the PHY has the needed registers in c22
space? Or do we assume that every C45 PHY has these registers?

..

>> > phydev->c45_over_c22 we are currently in a bad shape for. We cannot
>> > reliably say the bus master supports C45. If the bus capabilities say
>> > C22 only, we can set phydev->c45_over_c22. If the bus capabilities
>> > list C45, we can set it false. But that only covers a few busses, most
>> > don't have any capabilities set. We can try a C45 access and see if we
>> > get an -EOPNOTSUPP, in which case we can set phydev->c45_over_c22. But
>> > the bus driver could also do the wrong thing, issue a C22 transfer and
>> > give us back rubbish.
>> 
>> First question, what do you think about keeping the is_c45 property 
>> but
>> with a different meaning and add use_c45_over_c22. That way it will be
>> less code churn:
>> 
>>  * @is_c45:  Set to true if this PHY has clause 45 address space.
>>  * @use_c45_over_c22:  Set to true if c45-over-c22 addressing is used.
> 
> I prefer to change is_c45. We then get the compiler to help us with
> code review. The build bots will tell us about any code we fail to
> check and change. It will also help anybody with out of tree code
> making use of is_c45.

Ok. Fair enough.

-michael
Andrew Lunn March 24, 2022, 6:55 p.m. UTC | #7
> The
> > only valid case i can think of is for a very oddball PHY which has C45
> > register space, but cannot actually do C45 transfers, and so C45 over
> > C22 is the only option.
> 
> And how would you know that the PHY has the needed registers in c22
> space? Or do we assume that every C45 PHY has these registers?

I think it is a reasonable assumption at the moment. We have around
170 MDIO bus masters in Linux. All but one can do C22. There are
around 15 which can do both C22 and C45, and only one that i know of
which is C45 only. So anybody making a C45 transfer only PHY is making
their device hard to sell.

I suppose a PHY integrated with the MAC could by C45 only, and not
support C22, but i've not seen such a device yet. My guess such a
device would be an Ethernet switch, or a USB device, or maybe an
automotive device with a T1 PHY.

       Andrew
Russell King (Oracle) March 31, 2022, 11:31 a.m. UTC | #8
On Thu, Mar 24, 2022 at 01:41:44AM +0100, Andrew Lunn wrote:
ydev->c45_over_c22 we are currently in a bad shape for. We cannot
> reliably say the bus master supports C45. If the bus capabilities say
> C22 only, we can set phydev->c45_over_c22. If the bus capabilities
> list C45, we can set it false. But that only covers a few busses, most
> don't have any capabilities set. We can try a C45 access and see if we
> get an -EOPNOTSUPP, in which case we can set phydev->c45_over_c22. But
> the bus driver could also do the wrong thing, issue a C22 transfer and
> give us back rubbish.

Unfortunately, trying a C45 access will be very hit and miss - we
need to fix all the MDIO drivers before we do that to check the
access type. Many don't, and worse, many assume a C22 formatted
access request, and just try throwing the PHY address and register
address into the register fields without any masking. The result is
that a C45 access will set random bits in the register.

For example:
drivers/net/mdio/mdio-bcm-iproc.c (no bus capability):

        /* Prepare the read operation */
        cmd = (MII_DATA_TA_VAL << MII_DATA_TA_SHIFT) |
                (reg << MII_DATA_RA_SHIFT) |
                (phy_id << MII_DATA_PA_SHIFT) |
                BIT(MII_DATA_SB_SHIFT) |
                (MII_DATA_OP_READ << MII_DATA_OP_SHIFT);

        writel(cmd, priv->base + MII_DATA_OFFSET);

Similar is true for:
drivers/net/mdio/mdio-bcm-unimac.c (no bus capability)
drivers/net/mdio/mdio-hisi-femac.c (no bus capability)
drivers/net/mdio/mdio-moxart.c (no bus capability)
drivers/net/mdio/mdio-mscc-miim.c (no bus capability)
drivers/net/mdio/mdio-mux-bcm6368.c (no bus capability)
drivers/net/mdio/mdio-mux-bcm-iproc.c (no bus capability)
drivers/net/mdio/mdio-sun4i.c (no bus capability)

These truncate the fields, and fwics they don't set the bus type:
drivers/net/mdio/mdio-xgene.c (for the "rgmii" only bus and no bus capability)

So all of the above need at the very least code added to reject a C45
"dev" or "phy_id" address, or they need to set the bus capability
correctly.

My feeling is that the introduction of the bus capability hasn't done
much to improve this situation; it was introduced to hint at whether a
bus is safe for clause 45 accesses, but it hasn't actually solved this
problem because we patently still have this same issue today.

I think we just need to bite the bullet and audit all the MDIO drivers
we currently have, checking what the results would be if they were
passed a C45 access request, and make them reject such a request if
the register address or phy address obviously overflows into different
register fields and also mark them as C22-only.

I can't see any other reasonable option.
Russell King (Oracle) March 31, 2022, 11:44 a.m. UTC | #9
On Thu, Mar 24, 2022 at 06:18:14PM +0100, Michael Walle wrote:
> Am 2022-03-24 17:23, schrieb Andrew Lunn:
> > > Isn't it safe to assume that if a PHY implements the indirect
> > > registers for c45 in its c22 space that it will also have a valid
> > > PHY ID and then the it's driver will be probed?
> > 
> > See:
> > https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy_device.c#L895
> > 
> > No valid ID in C22 space.
> 
> I actually looked at the datasheet and yes, it implements the
> registers 13 and 14 in c22 to access the c45 space. I couldn't
> find any descriptions of other c22 registers though.

I'm not sure which PHY you're referring to here, but iirc, the later
hardware revisions of the 88x3310 implement the indirect access, but
earlier revisions do not.

> > In general, if the core can do something, it is better than the driver
> > doing it. If the core cannot reliably figure it out, then we have to
> > leave it to the drivers. It could well be we need the drivers to set
> > has_c45. I would prefer that drivers don't touch c45_over_c22 because
> > they don't have the knowledge of what the bus is capable of doing. The
> > only valid case i can think of is for a very oddball PHY which has C45
> > register space, but cannot actually do C45 transfers, and so C45 over
> > C22 is the only option.
> 
> And how would you know that the PHY has the needed registers in c22
> space? Or do we assume that every C45 PHY has these registers?

That's the problem. Currently C22 PHY drivers that do not support the
C45 register space have to set the .read_mmd and .write_mmd methods to
genphy_read_mmd_unsupported/genphy_write_mmd_unsupported which
effectively disables access to the C45 register space. In order for
that to happen, we must have read the C22 PHY ID and bound the driver.

That doesn't help with reading the PHY ID though.
Russell King (Oracle) March 31, 2022, 11:50 a.m. UTC | #10
On Thu, Mar 24, 2022 at 07:55:28PM +0100, Andrew Lunn wrote:
> > The
> > > only valid case i can think of is for a very oddball PHY which has C45
> > > register space, but cannot actually do C45 transfers, and so C45 over
> > > C22 is the only option.
> > 
> > And how would you know that the PHY has the needed registers in c22
> > space? Or do we assume that every C45 PHY has these registers?
> 
> I think it is a reasonable assumption at the moment. We have around
> 170 MDIO bus masters in Linux. All but one can do C22.

I don't think that is correct. I'm aware of the Marvell XMDIO driver
that is C45 only, and also xgene's non-rgmii "xfi" variant which is
also C45 only. Note that the xfi variant doesn't reject C22 and makes
no distinction between a C22 and C45 access (so a C22 access to
phy_id = 0 reg = 0 hits C45 phy_id = 0 mmd 0 reg 0.

MDIO drivers are IMHO an utter mess and are in dire need of fixing...
and I'm coming to the conclusion that the bodge of passing both C22
and C45 accesses through the same read/write functions is a huge
mistake, one that is crying out for fixing to prevent more prolification
of this kind of mess.

Yes, it's a lot of work, but I think it needs to be done. Retrofitting
the MDIO drivers with checks etc sounds nice, but if we assume that
patches will continue to be applied to net-next with little review,
we have a losing battle - it would be better to have interfaces designed
to make this kind of mistake impossible.
Andrew Lunn March 31, 2022, 12:06 p.m. UTC | #11
On Thu, Mar 31, 2022 at 12:50:42PM +0100, Russell King (Oracle) wrote:
> On Thu, Mar 24, 2022 at 07:55:28PM +0100, Andrew Lunn wrote:
> > > The
> > > > only valid case i can think of is for a very oddball PHY which has C45
> > > > register space, but cannot actually do C45 transfers, and so C45 over
> > > > C22 is the only option.
> > > 
> > > And how would you know that the PHY has the needed registers in c22
> > > space? Or do we assume that every C45 PHY has these registers?
> > 
> > I think it is a reasonable assumption at the moment. We have around
> > 170 MDIO bus masters in Linux. All but one can do C22.
> 
> I don't think that is correct. I'm aware of the Marvell XMDIO driver
> that is C45 only, and also xgene's non-rgmii "xfi" variant which is
> also C45 only. Note that the xfi variant doesn't reject C22 and makes
> no distinction between a C22 and C45 access (so a C22 access to
> phy_id = 0 reg = 0 hits C45 phy_id = 0 mmd 0 reg 0.
> 
> MDIO drivers are IMHO an utter mess and are in dire need of fixing...
> and I'm coming to the conclusion that the bodge of passing both C22
> and C45 accesses through the same read/write functions is a huge
> mistake, one that is crying out for fixing to prevent more prolification
> of this kind of mess.
> 
> Yes, it's a lot of work, but I think it needs to be done. Retrofitting
> the MDIO drivers with checks etc sounds nice, but if we assume that
> patches will continue to be applied to net-next with little review,
> we have a losing battle - it would be better to have interfaces designed
> to make this kind of mistake impossible.

Hi Russell

So what i think you are saying is change the mii_bus structure:

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 36ca2b5c2253..26322ee23867 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -353,10 +353,15 @@ struct mii_bus {
        const char *name;
        char id[MII_BUS_ID_SIZE];
        void *priv;
-       /** @read: Perform a read transfer on the bus */
-       int (*read)(struct mii_bus *bus, int addr, int regnum);
-       /** @write: Perform a write transfer on the bus */
-       int (*write)(struct mii_bus *bus, int addr, int regnum, u16 val);
+       /** @read: Perform a C22 read transfer on the bus */
+       int (*read_c22)(struct mii_bus *bus, int addr, int regnum);
+       /** @write: Perform a C22 write transfer on the bus */
+       int (*write_c22)(struct mii_bus *bus, int addr, int regnum, u16 val);
+       /** @read: Perform a C45 read transfer on the bus */
+       int (*read_c45)(struct mii_bus *bus, int addr, int devnum, int regnum);
+       /** @write: Perform a C45 write transfer on the bus */
+       int (*write_c45)(struct mii_bus *bus, int addr, int devnum,
+                        int regnum, u16 val);
        /** @reset: Perform a reset of the bus */
        int (*reset)(struct mii_bus *bus);

This way we get a cleaner interface, and the compiler helping us
finding drivers we miss during conversion?

	Andrew
Russell King (Oracle) March 31, 2022, 1:04 p.m. UTC | #12
On Thu, Mar 31, 2022 at 02:06:16PM +0200, Andrew Lunn wrote:
> On Thu, Mar 31, 2022 at 12:50:42PM +0100, Russell King (Oracle) wrote:
> > On Thu, Mar 24, 2022 at 07:55:28PM +0100, Andrew Lunn wrote:
> > > > The
> > > > > only valid case i can think of is for a very oddball PHY which has C45
> > > > > register space, but cannot actually do C45 transfers, and so C45 over
> > > > > C22 is the only option.
> > > > 
> > > > And how would you know that the PHY has the needed registers in c22
> > > > space? Or do we assume that every C45 PHY has these registers?
> > > 
> > > I think it is a reasonable assumption at the moment. We have around
> > > 170 MDIO bus masters in Linux. All but one can do C22.
> > 
> > I don't think that is correct. I'm aware of the Marvell XMDIO driver
> > that is C45 only, and also xgene's non-rgmii "xfi" variant which is
> > also C45 only. Note that the xfi variant doesn't reject C22 and makes
> > no distinction between a C22 and C45 access (so a C22 access to
> > phy_id = 0 reg = 0 hits C45 phy_id = 0 mmd 0 reg 0.
> > 
> > MDIO drivers are IMHO an utter mess and are in dire need of fixing...
> > and I'm coming to the conclusion that the bodge of passing both C22
> > and C45 accesses through the same read/write functions is a huge
> > mistake, one that is crying out for fixing to prevent more prolification
> > of this kind of mess.
> > 
> > Yes, it's a lot of work, but I think it needs to be done. Retrofitting
> > the MDIO drivers with checks etc sounds nice, but if we assume that
> > patches will continue to be applied to net-next with little review,
> > we have a losing battle - it would be better to have interfaces designed
> > to make this kind of mistake impossible.
> 
> Hi Russell
> 
> So what i think you are saying is change the mii_bus structure:
> 
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index 36ca2b5c2253..26322ee23867 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -353,10 +353,15 @@ struct mii_bus {
>         const char *name;
>         char id[MII_BUS_ID_SIZE];
>         void *priv;
> -       /** @read: Perform a read transfer on the bus */
> -       int (*read)(struct mii_bus *bus, int addr, int regnum);
> -       /** @write: Perform a write transfer on the bus */
> -       int (*write)(struct mii_bus *bus, int addr, int regnum, u16 val);
> +       /** @read: Perform a C22 read transfer on the bus */
> +       int (*read_c22)(struct mii_bus *bus, int addr, int regnum);
> +       /** @write: Perform a C22 write transfer on the bus */
> +       int (*write_c22)(struct mii_bus *bus, int addr, int regnum, u16 val);
> +       /** @read: Perform a C45 read transfer on the bus */
> +       int (*read_c45)(struct mii_bus *bus, int addr, int devnum, int regnum);
> +       /** @write: Perform a C45 write transfer on the bus */
> +       int (*write_c45)(struct mii_bus *bus, int addr, int devnum,
> +                        int regnum, u16 val);
>         /** @reset: Perform a reset of the bus */
>         int (*reset)(struct mii_bus *bus);
> 
> This way we get a cleaner interface, and the compiler helping us
> finding drivers we miss during conversion?

Yes, and not only does it mean the compiler helps us find unconverted
drivers, it will also stop new drivers that use the old interfaces being
merged and most importantly it puts an end to the ongoing question about
which clause accesses are actually supported by a MDIO driver.
diff mbox series

Patch

diff --git a/drivers/net/phy/mxl-gpy.c b/drivers/net/phy/mxl-gpy.c
index 5ce1bf03bbd7..0c825ec20eaa 100644
--- a/drivers/net/phy/mxl-gpy.c
+++ b/drivers/net/phy/mxl-gpy.c
@@ -99,7 +99,7 @@  static int gpy_probe(struct phy_device *phydev)
 	int ret;
 
 	if (!phydev->is_c45) {
-		ret = phy_get_c45_ids(phydev);
+		ret = phy_get_c45_ids_by_c22(phydev);
 		if (ret < 0)
 			return ret;
 	}
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index c766f5bb421a..43354b261bd5 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1005,18 +1005,28 @@  void phy_device_remove(struct phy_device *phydev)
 EXPORT_SYMBOL(phy_device_remove);
 
 /**
- * phy_get_c45_ids - Read 802.3-c45 IDs for phy device.
+ * phy_get_c45_ids - Read 802.3-c45 IDs for phy device by using indirect
+ *                   c22 accesses.
  * @phydev: phy_device structure to read 802.3-c45 IDs
  *
  * Returns zero on success, %-EIO on bus access error, or %-ENODEV if
  * the "devices in package" is invalid.
  */
-int phy_get_c45_ids(struct phy_device *phydev)
+int phy_get_c45_ids_by_c22(struct phy_device *phydev)
 {
-	return get_phy_c45_ids(phydev->mdio.bus, phydev->mdio.addr,
-			       &phydev->c45_ids);
+	int ret;
+
+	if (WARN(phydev->is_c45, "PHY is already clause 45\n"))
+		return -EINVAL;
+
+	ret = get_phy_c45_ids(phydev->mdio.bus, phydev->mdio.addr,
+			      &phydev->c45_ids);
+	if (!ret)
+		phydev->is_c45_over_c22 = true;
+
+	return ret;
 }
-EXPORT_SYMBOL(phy_get_c45_ids);
+EXPORT_SYMBOL(phy_get_c45_ids_by_c22);
 
 /**
  * phy_find_first - finds the first PHY device on the bus
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 36ca2b5c2253..eb436d603feb 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -525,6 +525,7 @@  struct macsec_ops;
  * @phy_id: UID for this device found during discovery
  * @c45_ids: 802.3-c45 Device Identifiers if is_c45.
  * @is_c45:  Set to true if this PHY uses clause 45 addressing.
+ * @is_c45_over_c22:  Set to true if this PHY uses c45-over-c22 addressing.
  * @is_internal: Set to true if this PHY is internal to a MAC.
  * @is_pseudo_fixed_link: Set to true if this PHY is an Ethernet switch, etc.
  * @is_gigabit_capable: Set to true if PHY supports 1000Mbps
@@ -606,6 +607,7 @@  struct phy_device {
 
 	struct phy_c45_device_ids c45_ids;
 	unsigned is_c45:1;
+	unsigned is_c45_over_c22:1;
 	unsigned is_internal:1;
 	unsigned is_pseudo_fixed_link:1;
 	unsigned is_gigabit_capable:1;
@@ -1466,7 +1468,7 @@  static inline int phy_device_register(struct phy_device *phy)
 static inline void phy_device_free(struct phy_device *phydev) { }
 #endif /* CONFIG_PHYLIB */
 void phy_device_remove(struct phy_device *phydev);
-int phy_get_c45_ids(struct phy_device *phydev);
+int phy_get_c45_ids_by_c22(struct phy_device *phydev);
 int phy_init_hw(struct phy_device *phydev);
 int phy_suspend(struct phy_device *phydev);
 int phy_resume(struct phy_device *phydev);