diff mbox series

[net-next] net: phylink: require supported_interfaces to be filled

Message ID E1q0K1u-006EIP-ET@rmk-PC.armlinux.org.uk (mailing list archive)
State Accepted
Commit de5c9bf40c4582729f64f66d9cf4920d50beb897
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: phylink: require supported_interfaces to be filled | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 44 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Russell King (Oracle) May 20, 2023, 10:41 a.m. UTC
We have been requiring the supported_interfaces bitmap to be filled in
by MAC drivers that have a mac_select_pcs() method. Now that all MAC
drivers fill in the supported_interfaces bitmap, it is time to enforce
this. We have already required supported_interfaces to be set in order
for optical SFPs to be configured in commit f81fa96d8a6c ("net: phylink:
use phy_interface_t bitmaps for optical modules").

Refuse phylink creation if supported_interfaces is empty, and remove
code to deal with cases where this mask is empty.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
I believe what I've said above is indeed the case, but there is always
the chance that something has been missed and this will cause breakage.
I would post as RFC and ask for testing, but in my experience that is
a complete waste of time as it doesn't result in any testing feedback.
So, it's probably better to get it merged into net-next and then wait
for any reports of breakage.

 drivers/net/phy/phylink.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

Comments

Andrew Lunn May 21, 2023, 3:23 p.m. UTC | #1
On Sat, May 20, 2023 at 11:41:42AM +0100, Russell King (Oracle) wrote:
> We have been requiring the supported_interfaces bitmap to be filled in
> by MAC drivers that have a mac_select_pcs() method. Now that all MAC
> drivers fill in the supported_interfaces bitmap, it is time to enforce
> this. We have already required supported_interfaces to be set in order
> for optical SFPs to be configured in commit f81fa96d8a6c ("net: phylink:
> use phy_interface_t bitmaps for optical modules").
> 
> Refuse phylink creation if supported_interfaces is empty, and remove
> code to deal with cases where this mask is empty.
> 
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

> I believe what I've said above is indeed the case, but there is always
> the chance that something has been missed and this will cause breakage.
> I would post as RFC and ask for testing, but in my experience that is
> a complete waste of time as it doesn't result in any testing feedback.
> So, it's probably better to get it merged into net-next and then wait
> for any reports of breakage.

Agreed.

    Andrew
patchwork-bot+netdevbpf@kernel.org May 23, 2023, 2:20 a.m. UTC | #2
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sat, 20 May 2023 11:41:42 +0100 you wrote:
> We have been requiring the supported_interfaces bitmap to be filled in
> by MAC drivers that have a mac_select_pcs() method. Now that all MAC
> drivers fill in the supported_interfaces bitmap, it is time to enforce
> this. We have already required supported_interfaces to be set in order
> for optical SFPs to be configured in commit f81fa96d8a6c ("net: phylink:
> use phy_interface_t bitmaps for optical modules").
> 
> [...]

Here is the summary with links:
  - [net-next] net: phylink: require supported_interfaces to be filled
    https://git.kernel.org/netdev/net-next/c/de5c9bf40c45

You are awesome, thank you!
Greg Ungerer Nov. 21, 2023, 2:19 p.m. UTC | #3
Hi Russell,

On 20/5/23 20:41, Russell King (Oracle) wrote:
> We have been requiring the supported_interfaces bitmap to be filled in
> by MAC drivers that have a mac_select_pcs() method. Now that all MAC
> drivers fill in the supported_interfaces bitmap, it is time to enforce
> this. We have already required supported_interfaces to be set in order
> for optical SFPs to be configured in commit f81fa96d8a6c ("net: phylink:
> use phy_interface_t bitmaps for optical modules").
> 
> Refuse phylink creation if supported_interfaces is empty, and remove
> code to deal with cases where this mask is empty.
> 
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> I believe what I've said above is indeed the case, but there is always
> the chance that something has been missed and this will cause breakage.
> I would post as RFC and ask for testing, but in my experience that is
> a complete waste of time as it doesn't result in any testing feedback.
> So, it's probably better to get it merged into net-next and then wait
> for any reports of breakage.

This commit breaks a platform I have with a Marvell 88e6350 switch.
During boot up it now fails with:

     ...
     mv88e6085 d0072004.mdio-mii:11: switch 0x3710 detected: Marvell 88E6350, revision 2
     mv88e6085 d0072004.mdio-mii:11: phylink: error: empty supported_interfaces
     error creating PHYLINK: -22
     mv88e6085: probe of d0072004.mdio-mii:11 failed with error -22
     ...

The 6350 looks to be similar to the 6352 in many respects, though it lacks
a SERDES interface, but it otherwise mostly seems compatible. Using the 6352
phylink_get_caps function instead of the 6185 one fixes this:

--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -5418,7 +5418,7 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
         .set_max_frame_size = mv88e6185_g1_set_max_frame_size,
         .stu_getnext = mv88e6352_g1_stu_getnext,
         .stu_loadpurge = mv88e6352_g1_stu_loadpurge,
-       .phylink_get_caps = mv88e6185_phylink_get_caps,
+       .phylink_get_caps = mv88e6352_phylink_get_caps,
  };

  static const struct mv88e6xxx_ops mv88e6351_ops = {


The story doesn't quite end here though. With this fix in place support
for the 6350 is then again broken by commit b92143d4420f ("net: dsa:
mv88e6xxx: add infrastructure for phylink_pcs"). This results in a dump
on boot up:

     ...
     mv88e6085 d0072004.mdio-mii:11: switch 0x3710 detected: Marvell 88E6350, revision 2
     8<--- cut here ---
     Unable to handle kernel NULL pointer dereference at virtual address 00000000 when read
     [00000000] *pgd=00000000
     Internal error: Oops: 5 [#1] ARM
     Modules linked in:
     CPU: 0 PID: 8 Comm: kworker/u2:0 Not tainted 6.7.0-rc2-dirty #26
     Hardware name: Marvell Armada 370/XP (Device Tree)
     Workqueue: events_unbound deferred_probe_work_func
     PC is at mv88e6xxx_port_setup+0x1c/0x44
     LR is at dsa_port_devlink_setup+0x74/0x154
     pc : [<c057ea24>]    lr : [<c0819598>]    psr: a0000013
     sp : c184fce0  ip : c542b8f4  fp : 00000000
     r10: 00000001  r9 : c542a540  r8 : c542bc00
     r7 : c542b838  r6 : c5244580  r5 : 00000005  r4 : c5244580
     r3 : 00000000  r2 : c542b840  r1 : 00000005  r0 : c1a02040
     ...

I can see that the mv88e6350_ops struct doesn't have an initializer for
pcs_ops. Based on the similarity with the 6352 again I tried using the
mv88e6352_pcs_ops for that:

--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -5069,7 +5069,8 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
         .vtu_loadpurge = mv88e6352_g1_vtu_loadpurge,
         .stu_getnext = mv88e6352_g1_stu_getnext,
         .stu_loadpurge = mv88e6352_g1_stu_loadpurge,
-       .phylink_get_caps = mv88e6185_phylink_get_caps,
+       .phylink_get_caps = mv88e6352_phylink_get_caps,
+       .pcs_ops = &mv88e6352_pcs_ops,
  };


This gets the 6350 switch back to working again (on current linux 6.7-rc2).
I am not entirely sure if this needs a dedicated phylink_get_caps
and pcs_ops for the 6350, or if it is safe to use the 6352 ones?

Looking at the mv88e6351_ops I am guessing it is going to suffer the
same problems too.

Regards
Greg



>   drivers/net/phy/phylink.c | 26 +++++++++++---------------
>   1 file changed, 11 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index a4dd5197355a..093b7b6e0263 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -712,14 +712,11 @@ static int phylink_validate(struct phylink *pl, unsigned long *supported,
>   {
>   	const unsigned long *interfaces = pl->config->supported_interfaces;
>   
> -	if (!phy_interface_empty(interfaces)) {
> -		if (state->interface == PHY_INTERFACE_MODE_NA)
> -			return phylink_validate_mask(pl, supported, state,
> -						     interfaces);
> +	if (state->interface == PHY_INTERFACE_MODE_NA)
> +		return phylink_validate_mask(pl, supported, state, interfaces);
>   
> -		if (!test_bit(state->interface, interfaces))
> -			return -EINVAL;
> -	}
> +	if (!test_bit(state->interface, interfaces))
> +		return -EINVAL;
>   
>   	return phylink_validate_mac_and_pcs(pl, supported, state);
>   }
> @@ -1513,19 +1510,18 @@ struct phylink *phylink_create(struct phylink_config *config,
>   	struct phylink *pl;
>   	int ret;
>   
> -	if (mac_ops->mac_select_pcs &&
> -	    mac_ops->mac_select_pcs(config, PHY_INTERFACE_MODE_NA) !=
> -	      ERR_PTR(-EOPNOTSUPP))
> -		using_mac_select_pcs = true;
> -
>   	/* Validate the supplied configuration */
> -	if (using_mac_select_pcs &&
> -	    phy_interface_empty(config->supported_interfaces)) {
> +	if (phy_interface_empty(config->supported_interfaces)) {
>   		dev_err(config->dev,
> -			"phylink: error: empty supported_interfaces but mac_select_pcs() method present\n");
> +			"phylink: error: empty supported_interfaces\n");
>   		return ERR_PTR(-EINVAL);
>   	}
>   
> +	if (mac_ops->mac_select_pcs &&
> +	    mac_ops->mac_select_pcs(config, PHY_INTERFACE_MODE_NA) !=
> +	      ERR_PTR(-EOPNOTSUPP))
> +		using_mac_select_pcs = true;
> +
>   	pl = kzalloc(sizeof(*pl), GFP_KERNEL);
>   	if (!pl)
>   		return ERR_PTR(-ENOMEM);
Andrew Lunn Nov. 21, 2023, 2:29 p.m. UTC | #4
> The 6350 looks to be similar to the 6352 in many respects, though it lacks
> a SERDES interface, but it otherwise mostly seems compatible.

Not having the SERDES is important. Without that SERDES, the bit about
Port 4 in mv88e6352_phylink_get_caps() is
incorrect. mv88e61852_phylink_get_caps() looks reasonable for this
hardware.

> Using the 6352
> phylink_get_caps function instead of the 6185 one fixes this:
> 
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -5418,7 +5418,7 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
>         .set_max_frame_size = mv88e6185_g1_set_max_frame_size,
>         .stu_getnext = mv88e6352_g1_stu_getnext,
>         .stu_loadpurge = mv88e6352_g1_stu_loadpurge,
> -       .phylink_get_caps = mv88e6185_phylink_get_caps,
> +       .phylink_get_caps = mv88e6352_phylink_get_caps,
>  };
> 
>  static const struct mv88e6xxx_ops mv88e6351_ops = {
> 
> 
> The story doesn't quite end here though. With this fix in place support
> for the 6350 is then again broken by commit b92143d4420f ("net: dsa:
> mv88e6xxx: add infrastructure for phylink_pcs"). This results in a dump
> on boot up:

PCS is approximately another name of a SERDES. Since there is no
SERDES, you don't don't want any of the pcs ops filled in.

Russell knows this code much better than i do. Let see what he says.

	Andrew
Russell King (Oracle) Nov. 21, 2023, 2:41 p.m. UTC | #5
On Wed, Nov 22, 2023 at 12:19:38AM +1000, Greg Ungerer wrote:
> Hi Russell,
> 
> On 20/5/23 20:41, Russell King (Oracle) wrote:
> > We have been requiring the supported_interfaces bitmap to be filled in
> > by MAC drivers that have a mac_select_pcs() method. Now that all MAC
> > drivers fill in the supported_interfaces bitmap, it is time to enforce
> > this. We have already required supported_interfaces to be set in order
> > for optical SFPs to be configured in commit f81fa96d8a6c ("net: phylink:
> > use phy_interface_t bitmaps for optical modules").
> > 
> > Refuse phylink creation if supported_interfaces is empty, and remove
> > code to deal with cases where this mask is empty.
> > 
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> > I believe what I've said above is indeed the case, but there is always
> > the chance that something has been missed and this will cause breakage.
> > I would post as RFC and ask for testing, but in my experience that is
> > a complete waste of time as it doesn't result in any testing feedback.
> > So, it's probably better to get it merged into net-next and then wait
> > for any reports of breakage.
> 
> This commit breaks a platform I have with a Marvell 88e6350 switch.
> During boot up it now fails with:
> 
>     ...
>     mv88e6085 d0072004.mdio-mii:11: switch 0x3710 detected: Marvell 88E6350, revision 2
>     mv88e6085 d0072004.mdio-mii:11: phylink: error: empty supported_interfaces
>     error creating PHYLINK: -22
>     mv88e6085: probe of d0072004.mdio-mii:11 failed with error -22
>     ...
> 
> The 6350 looks to be similar to the 6352 in many respects, though it lacks
> a SERDES interface, but it otherwise mostly seems compatible. Using the 6352
> phylink_get_caps function instead of the 6185 one fixes this:
> 
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -5418,7 +5418,7 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
>         .set_max_frame_size = mv88e6185_g1_set_max_frame_size,
>         .stu_getnext = mv88e6352_g1_stu_getnext,
>         .stu_loadpurge = mv88e6352_g1_stu_loadpurge,
> -       .phylink_get_caps = mv88e6185_phylink_get_caps,
> +       .phylink_get_caps = mv88e6352_phylink_get_caps,
>  };

Based on what you say, this is probably correct, but I've no idea as I
don't have any data on the 88e6350.

> The story doesn't quite end here though. With this fix in place support
> for the 6350 is then again broken by commit b92143d4420f ("net: dsa:
> mv88e6xxx: add infrastructure for phylink_pcs"). This results in a dump
> on boot up:
> 
>     ...
>     mv88e6085 d0072004.mdio-mii:11: switch 0x3710 detected: Marvell 88E6350, revision 2
>     8<--- cut here ---
>     Unable to handle kernel NULL pointer dereference at virtual address 00000000 when read
>     [00000000] *pgd=00000000
>     Internal error: Oops: 5 [#1] ARM
>     Modules linked in:
>     CPU: 0 PID: 8 Comm: kworker/u2:0 Not tainted 6.7.0-rc2-dirty #26
>     Hardware name: Marvell Armada 370/XP (Device Tree)
>     Workqueue: events_unbound deferred_probe_work_func
>     PC is at mv88e6xxx_port_setup+0x1c/0x44
>     LR is at dsa_port_devlink_setup+0x74/0x154
>     pc : [<c057ea24>]    lr : [<c0819598>]    psr: a0000013
>     sp : c184fce0  ip : c542b8f4  fp : 00000000
>     r10: 00000001  r9 : c542a540  r8 : c542bc00
>     r7 : c542b838  r6 : c5244580  r5 : 00000005  r4 : c5244580
>     r3 : 00000000  r2 : c542b840  r1 : 00000005  r0 : c1a02040
>     ...
> 
> I can see that the mv88e6350_ops struct doesn't have an initializer for
> pcs_ops.

You mentioned that the 6350 doesn't have serdes, so there should be no
PCS. mv88e6xxx_port_setup() probably should be doing:

-	if (chip->info->ops->pcs_ops->pcs_init) {
+	if (chip->info->ops->pcs_ops &&
+	    chip->info->ops->pcs_ops->pcs_init) {

> This gets the 6350 switch back to working again (on current linux 6.7-rc2).
> I am not entirely sure if this needs a dedicated phylink_get_caps
> and pcs_ops for the 6350, or if it is safe to use the 6352 ones?

Without knowing what the 6350 cmode values are, I can't comment.

> Looking at the mv88e6351_ops I am guessing it is going to suffer the
> same problems too.

Again, it's a similar problem - I have no information for the 6351
so I could only make guesses.
Greg Ungerer Nov. 22, 2023, 4:12 a.m. UTC | #6
On 22/11/23 00:29, Andrew Lunn wrote:
>> The 6350 looks to be similar to the 6352 in many respects, though it lacks
>> a SERDES interface, but it otherwise mostly seems compatible.
> 
> Not having the SERDES is important. Without that SERDES, the bit about
> Port 4 in mv88e6352_phylink_get_caps() is
> incorrect. mv88e61852_phylink_get_caps() looks reasonable for this
> hardware.
              ^^^^^^^^^^
The problem with mv88e6185_phylink_get_caps() is the cmode check fails
for me. For my 6350 hardware chip->ports[port].cmode is "9", so set to
MV88E6XXX_PORT_STS_CMODE_1000BASEX. But that is not part of the defines
used in mv88e6185_phy_interface_modes[].

Doesn't it need to be checking in mv88e6xxx_phy_interface_modes[]
for the cmode?

I see another similar function, mv88e6250_phylink_get_caps().
But that is only 10/100 capable.

So I am thinking that something like this actually makes more sense now:

--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -577,6 +577,18 @@ static void mv88e6250_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
         config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100;
  }
  
+static void mv88e6350_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
+                                      struct phylink_config *config)
+{
+       unsigned long *supported = config->supported_interfaces;
+
+       /* Translate the default cmode */
+       mv88e6xxx_translate_cmode(chip->ports[port].cmode, supported);
+
+       config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 |
+                                  MAC_1000FD;
+}
+
  static int mv88e6352_get_port4_serdes_cmode(struct mv88e6xxx_chip *chip)
  {
         u16 reg, val;
@@ -5069,7 +5082,7 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
         .vtu_loadpurge = mv88e6352_g1_vtu_loadpurge,
         .stu_getnext = mv88e6352_g1_stu_getnext,
         .stu_loadpurge = mv88e6352_g1_stu_loadpurge,
-       .phylink_get_caps = mv88e6185_phylink_get_caps,
+       .phylink_get_caps = mv88e6350_phylink_get_caps,
  };
  
  static const struct mv88e6xxx_ops mv88e6351_ops = {
@@ -5117,7 +5130,7 @@ static const struct mv88e6xxx_ops mv88e6351_ops = {
         .stu_loadpurge = mv88e6352_g1_stu_loadpurge,
         .avb_ops = &mv88e6352_avb_ops,
         .ptp_ops = &mv88e6352_ptp_ops,
-       .phylink_get_caps = mv88e6185_phylink_get_caps,
+       .phylink_get_caps = mv88e6350_phylink_get_caps,
  };
  
  static const struct mv88e6xxx_ops mv88e6352_ops = {


>> Using the 6352
>> phylink_get_caps function instead of the 6185 one fixes this:
>>
>> --- a/drivers/net/dsa/mv88e6xxx/chip.c
>> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
>> @@ -5418,7 +5418,7 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
>>          .set_max_frame_size = mv88e6185_g1_set_max_frame_size,
>>          .stu_getnext = mv88e6352_g1_stu_getnext,
>>          .stu_loadpurge = mv88e6352_g1_stu_loadpurge,
>> -       .phylink_get_caps = mv88e6185_phylink_get_caps,
>> +       .phylink_get_caps = mv88e6352_phylink_get_caps,
>>   };
>>
>>   static const struct mv88e6xxx_ops mv88e6351_ops = {
>>
>>
>> The story doesn't quite end here though. With this fix in place support
>> for the 6350 is then again broken by commit b92143d4420f ("net: dsa:
>> mv88e6xxx: add infrastructure for phylink_pcs"). This results in a dump
>> on boot up:
> 
> PCS is approximately another name of a SERDES. Since there is no
> SERDES, you don't don't want any of the pcs ops filled in.
> 
> Russell knows this code much better than i do. Let see what he says.

Ok, that makes sense. Russell had a suggestion for this one and I will
follow up with that.

Thanks
Greg
Greg Ungerer Nov. 22, 2023, 4:41 a.m. UTC | #7
On 22/11/23 00:41, Russell King (Oracle) wrote:
> On Wed, Nov 22, 2023 at 12:19:38AM +1000, Greg Ungerer wrote:
>> Hi Russell,
>>
>> On 20/5/23 20:41, Russell King (Oracle) wrote:
>>> We have been requiring the supported_interfaces bitmap to be filled in
>>> by MAC drivers that have a mac_select_pcs() method. Now that all MAC
>>> drivers fill in the supported_interfaces bitmap, it is time to enforce
>>> this. We have already required supported_interfaces to be set in order
>>> for optical SFPs to be configured in commit f81fa96d8a6c ("net: phylink:
>>> use phy_interface_t bitmaps for optical modules").
>>>
>>> Refuse phylink creation if supported_interfaces is empty, and remove
>>> code to deal with cases where this mask is empty.
>>>
>>> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
>>> ---
>>> I believe what I've said above is indeed the case, but there is always
>>> the chance that something has been missed and this will cause breakage.
>>> I would post as RFC and ask for testing, but in my experience that is
>>> a complete waste of time as it doesn't result in any testing feedback.
>>> So, it's probably better to get it merged into net-next and then wait
>>> for any reports of breakage.
>>
>> This commit breaks a platform I have with a Marvell 88e6350 switch.
>> During boot up it now fails with:
>>
>>      ...
>>      mv88e6085 d0072004.mdio-mii:11: switch 0x3710 detected: Marvell 88E6350, revision 2
>>      mv88e6085 d0072004.mdio-mii:11: phylink: error: empty supported_interfaces
>>      error creating PHYLINK: -22
>>      mv88e6085: probe of d0072004.mdio-mii:11 failed with error -22
>>      ...
>>
>> The 6350 looks to be similar to the 6352 in many respects, though it lacks
>> a SERDES interface, but it otherwise mostly seems compatible. Using the 6352
>> phylink_get_caps function instead of the 6185 one fixes this:
>>
>> --- a/drivers/net/dsa/mv88e6xxx/chip.c
>> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
>> @@ -5418,7 +5418,7 @@ static const struct mv88e6xxx_ops mv88e6350_ops = {
>>          .set_max_frame_size = mv88e6185_g1_set_max_frame_size,
>>          .stu_getnext = mv88e6352_g1_stu_getnext,
>>          .stu_loadpurge = mv88e6352_g1_stu_loadpurge,
>> -       .phylink_get_caps = mv88e6185_phylink_get_caps,
>> +       .phylink_get_caps = mv88e6352_phylink_get_caps,
>>   };
> 
> Based on what you say, this is probably correct, but I've no idea as I
> don't have any data on the 88e6350.

I am thinking a dedicated 6350 phylink_get_caps may be better here now,
since the 6350 does not have a SERDES lane like the 6352.


>> The story doesn't quite end here though. With this fix in place support
>> for the 6350 is then again broken by commit b92143d4420f ("net: dsa:
>> mv88e6xxx: add infrastructure for phylink_pcs"). This results in a dump
>> on boot up:
>>
>>      ...
>>      mv88e6085 d0072004.mdio-mii:11: switch 0x3710 detected: Marvell 88E6350, revision 2
>>      8<--- cut here ---
>>      Unable to handle kernel NULL pointer dereference at virtual address 00000000 when read
>>      [00000000] *pgd=00000000
>>      Internal error: Oops: 5 [#1] ARM
>>      Modules linked in:
>>      CPU: 0 PID: 8 Comm: kworker/u2:0 Not tainted 6.7.0-rc2-dirty #26
>>      Hardware name: Marvell Armada 370/XP (Device Tree)
>>      Workqueue: events_unbound deferred_probe_work_func
>>      PC is at mv88e6xxx_port_setup+0x1c/0x44
>>      LR is at dsa_port_devlink_setup+0x74/0x154
>>      pc : [<c057ea24>]    lr : [<c0819598>]    psr: a0000013
>>      sp : c184fce0  ip : c542b8f4  fp : 00000000
>>      r10: 00000001  r9 : c542a540  r8 : c542bc00
>>      r7 : c542b838  r6 : c5244580  r5 : 00000005  r4 : c5244580
>>      r3 : 00000000  r2 : c542b840  r1 : 00000005  r0 : c1a02040
>>      ...
>>
>> I can see that the mv88e6350_ops struct doesn't have an initializer for
>> pcs_ops.
> 
> You mentioned that the 6350 doesn't have serdes, so there should be no
> PCS. mv88e6xxx_port_setup() probably should be doing:
> 
> -	if (chip->info->ops->pcs_ops->pcs_init) {
> +	if (chip->info->ops->pcs_ops &&
> +	    chip->info->ops->pcs_ops->pcs_init) {

Yes, that probably makes more sense.
With that change in place the probing works again, and does not need
a pcs_ops entry for the 6350.

Thanks
Greg



>> This gets the 6350 switch back to working again (on current linux 6.7-rc2).
>> I am not entirely sure if this needs a dedicated phylink_get_caps
>> and pcs_ops for the 6350, or if it is safe to use the 6352 ones?
> 
> Without knowing what the 6350 cmode values are, I can't comment.
> 
>> Looking at the mv88e6351_ops I am guessing it is going to suffer the
>> same problems too.
> 
> Again, it's a similar problem - I have no information for the 6351
> so I could only make guesses.
>
Russell King (Oracle) Nov. 22, 2023, 9:27 a.m. UTC | #8
On Wed, Nov 22, 2023 at 02:12:44PM +1000, Greg Ungerer wrote:
> So I am thinking that something like this actually makes more sense now:
> 
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -577,6 +577,18 @@ static void mv88e6250_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
>         config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100;
>  }
> +static void mv88e6350_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
> +                                      struct phylink_config *config)
> +{
> +       unsigned long *supported = config->supported_interfaces;
> +
> +       /* Translate the default cmode */
> +       mv88e6xxx_translate_cmode(chip->ports[port].cmode, supported);
> +
> +       config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 |
> +                                  MAC_1000FD;
> +}
> +

Looks sensible to me - but I do notice that a black line has been lost
between mv88e6250_phylink_get_caps() and your new function - probably
down to your email client being stupid with whitespace because it's
broken the patch context. Just be aware of that when you come to send
the patch for real.
Greg Ungerer Nov. 22, 2023, 1:10 p.m. UTC | #9
On 22/11/23 19:27, Russell King (Oracle) wrote:
> On Wed, Nov 22, 2023 at 02:12:44PM +1000, Greg Ungerer wrote:
>> So I am thinking that something like this actually makes more sense now:
>>
>> --- a/drivers/net/dsa/mv88e6xxx/chip.c
>> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
>> @@ -577,6 +577,18 @@ static void mv88e6250_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
>>          config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100;
>>   }
>> +static void mv88e6350_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
>> +                                      struct phylink_config *config)
>> +{
>> +       unsigned long *supported = config->supported_interfaces;
>> +
>> +       /* Translate the default cmode */
>> +       mv88e6xxx_translate_cmode(chip->ports[port].cmode, supported);
>> +
>> +       config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 |
>> +                                  MAC_1000FD;
>> +}
>> +
> 
> Looks sensible to me - but I do notice that a black line has been lost
> between mv88e6250_phylink_get_caps() and your new function - probably
> down to your email client being stupid with whitespace because it's
> broken the patch context. Just be aware of that when you come to send
> the patch for real.

Oh, yes, for sure. The above was a cut and paste into email client.
I'll send proper patches via git send-email soon.

Regards
Greg
diff mbox series

Patch

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index a4dd5197355a..093b7b6e0263 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -712,14 +712,11 @@  static int phylink_validate(struct phylink *pl, unsigned long *supported,
 {
 	const unsigned long *interfaces = pl->config->supported_interfaces;
 
-	if (!phy_interface_empty(interfaces)) {
-		if (state->interface == PHY_INTERFACE_MODE_NA)
-			return phylink_validate_mask(pl, supported, state,
-						     interfaces);
+	if (state->interface == PHY_INTERFACE_MODE_NA)
+		return phylink_validate_mask(pl, supported, state, interfaces);
 
-		if (!test_bit(state->interface, interfaces))
-			return -EINVAL;
-	}
+	if (!test_bit(state->interface, interfaces))
+		return -EINVAL;
 
 	return phylink_validate_mac_and_pcs(pl, supported, state);
 }
@@ -1513,19 +1510,18 @@  struct phylink *phylink_create(struct phylink_config *config,
 	struct phylink *pl;
 	int ret;
 
-	if (mac_ops->mac_select_pcs &&
-	    mac_ops->mac_select_pcs(config, PHY_INTERFACE_MODE_NA) !=
-	      ERR_PTR(-EOPNOTSUPP))
-		using_mac_select_pcs = true;
-
 	/* Validate the supplied configuration */
-	if (using_mac_select_pcs &&
-	    phy_interface_empty(config->supported_interfaces)) {
+	if (phy_interface_empty(config->supported_interfaces)) {
 		dev_err(config->dev,
-			"phylink: error: empty supported_interfaces but mac_select_pcs() method present\n");
+			"phylink: error: empty supported_interfaces\n");
 		return ERR_PTR(-EINVAL);
 	}
 
+	if (mac_ops->mac_select_pcs &&
+	    mac_ops->mac_select_pcs(config, PHY_INTERFACE_MODE_NA) !=
+	      ERR_PTR(-EOPNOTSUPP))
+		using_mac_select_pcs = true;
+
 	pl = kzalloc(sizeof(*pl), GFP_KERNEL);
 	if (!pl)
 		return ERR_PTR(-ENOMEM);