diff mbox

DSA: phy polling

Message ID 20150914104254.GI21084@n2100.arm.linux.org.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Russell King - ARM Linux Sept. 14, 2015, 10:42 a.m. UTC
Andrew,

I think you're the current maintainer of the Marvell DSA code, as being
the most recent author of changes to it. :)

I've noticed in my testing that the Marvell DSA code seems to poll the
internal phy link status in mv88e6xxx_poll_link(), and set the network
device carrier status according to the results.

However, the internal phys are created using phylib, which also polls
the phys for their link status, and controls the associated netdev
carrier status.

The side effect of this is that I see duplicated link status messages in
the kernel log when connecting or disconnecting cables from the switch,
caused by the code in mv88e6xxx_poll_link() racing with the phylib code.
From what I can see, the code in mv88e6xxx_poll_link() is entirely
redundant as the phylib layer will take care of any phy attached to the
switch.

To prove this, I have the following code in my tree, which disables the
polling on a port where we have a phy attached (either an internal or
external phy).  The result is that the per-port network devices are still
updated with the link status even though this code is disabled - thanks
to the phylib polling.

I'm left wondering whether the DSA specific phy polling does anything
useful, or whether the entire polling code both in mv88e6xxx.c and
net/dsa can be removed (mv88e6xxx.c seems to be its only user.)

Comments

Florian Fainelli Sept. 14, 2015, 5:28 p.m. UTC | #1
On 14/09/15 03:42, Russell King - ARM Linux wrote:
> Andrew,
> 
> I think you're the current maintainer of the Marvell DSA code, as being
> the most recent author of changes to it. :)
> 
> I've noticed in my testing that the Marvell DSA code seems to poll the
> internal phy link status in mv88e6xxx_poll_link(), and set the network
> device carrier status according to the results.
> 
> However, the internal phys are created using phylib, which also polls
> the phys for their link status, and controls the associated netdev
> carrier status.
> 
> The side effect of this is that I see duplicated link status messages in
> the kernel log when connecting or disconnecting cables from the switch,
> caused by the code in mv88e6xxx_poll_link() racing with the phylib code.
> From what I can see, the code in mv88e6xxx_poll_link() is entirely
> redundant as the phylib layer will take care of any phy attached to the
> switch.
> 
> To prove this, I have the following code in my tree, which disables the
> polling on a port where we have a phy attached (either an internal or
> external phy).  The result is that the per-port network devices are still
> updated with the link status even though this code is disabled - thanks
> to the phylib polling.
> 
> I'm left wondering whether the DSA specific phy polling does anything
> useful, or whether the entire polling code both in mv88e6xxx.c and
> net/dsa can be removed (mv88e6xxx.c seems to be its only user.)

Just my 2 cents here, I suspect the original intention behind this code
was to help utilize the switch's built-in PHY polling unit when
available, and use the HW to collect the state of all PHYs in fewer
register to read, instead of having to do individual (and quite possibly
expensive) MDIO reads towards each individual per-port PHYs (at least
two reads per PHY to latch MII_BMSR).

Now, I do agree there is a duplication of functionality here, and a
potential fix would be to avoid starting the PHY state machine if/when
the switch supports such a feature (not call phy_start*), that should
still get you consistent consistent link partner advertised/status
values, question is, does that really benefit anybody though?

> 
> diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
> index 26ec2fbfaa89..4c324eafeef2 100644
> --- a/drivers/net/dsa/mv88e6xxx.c
> +++ b/drivers/net/dsa/mv88e6xxx.c
> @@ -400,6 +400,13 @@ void mv88e6xxx_poll_link(struct dsa_switch *ds)
>  		if (dev == NULL)
>  			continue;
>  
> +		/*
> +		 * Ignore ports which have a phy; phylib will take care
> +		 * of polling the link status for these.
> +		 */
> +		if (dsa_slave_has_phy(dev))
> +			continue;
> +
>  		link = 0;
>  		if (dev->flags & IFF_UP) {
>  			port_status = mv88e6xxx_reg_read(ds, REG_PORT(i),
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index fbca63ba8f73..b31e9da43ea7 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -176,6 +176,8 @@ static inline bool dsa_is_port_initialized(struct dsa_switch *ds, int p)
>  	return ds->phys_port_mask & (1 << p) && ds->ports[p];
>  }
>  
> +extern bool dsa_slave_has_phy(struct net_device *);
> +
>  static inline u8 dsa_upstream_port(struct dsa_switch *ds)
>  {
>  	struct dsa_switch_tree *dst = ds->dst;
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 35c47ddd04f0..a107242816ff 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -10,6 +10,7 @@
>  
>  #include <linux/list.h>
>  #include <linux/etherdevice.h>
> +#include <linux/export.h>
>  #include <linux/netdevice.h>
>  #include <linux/phy.h>
>  #include <linux/phy_fixed.h>
> @@ -873,6 +874,14 @@ int dsa_slave_resume(struct net_device *slave_dev)
>  	return 0;
>  }
>  
> +bool dsa_slave_has_phy(struct net_device *slave_dev)
> +{
> +	struct dsa_slave_priv *p = netdev_priv(slave_dev);
> +
> +	return p->phy != NULL;
> +}
> +EXPORT_SYMBOL_GPL(dsa_slave_has_phy);
> +
>  int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>  		     int port, char *name)
>  {
> 
>
Russell King - ARM Linux Sept. 14, 2015, 6:23 p.m. UTC | #2
On Mon, Sep 14, 2015 at 10:28:55AM -0700, Florian Fainelli wrote:
> Just my 2 cents here, I suspect the original intention behind this code
> was to help utilize the switch's built-in PHY polling unit when
> available, and use the HW to collect the state of all PHYs in fewer
> register to read, instead of having to do individual (and quite possibly
> expensive) MDIO reads towards each individual per-port PHYs (at least
> two reads per PHY to latch MII_BMSR).

Does the Marvell phy have such a register?  Looking at the register
dump and plugging/unplugging cables seems not to show a register
reporting whether any particular interface has changed state, and
I haven't noticed there being any combined register in anything I've
seen on these switches.

> Now, I do agree there is a duplication of functionality here, and a
> potential fix would be to avoid starting the PHY state machine if/when
> the switch supports such a feature (not call phy_start*), that should
> still get you consistent consistent link partner advertised/status
> values, question is, does that really benefit anybody though?

I disagree - it's the DSA polling that needs to go.  The DSA polling
only looks at the port status, and derives from it the carrier
state.  The rest of the information is only turned into a printk().
The PHY state machine does a lot more, recording the link speed so
that ethtool works on the interface.

If we do want to go the other way, then the phy code needs a rework so
that it can be properly classed and drivers with non-standard MII
registers supported without needing to build register emulation layers.
Florian Fainelli Sept. 14, 2015, 10:41 p.m. UTC | #3
On 14/09/15 11:23, Russell King - ARM Linux wrote:
> On Mon, Sep 14, 2015 at 10:28:55AM -0700, Florian Fainelli wrote:
>> Just my 2 cents here, I suspect the original intention behind this code
>> was to help utilize the switch's built-in PHY polling unit when
>> available, and use the HW to collect the state of all PHYs in fewer
>> register to read, instead of having to do individual (and quite possibly
>> expensive) MDIO reads towards each individual per-port PHYs (at least
>> two reads per PHY to latch MII_BMSR).
> 
> Does the Marvell phy have such a register?  Looking at the register
> dump and plugging/unplugging cables seems not to show a register
> reporting whether any particular interface has changed state, and
> I haven't noticed there being any combined register in anything I've
> seen on these switches.

It seemed to me like the PPU was meant to provide that, but I cannot
find any "summary" register which would give you such a status, must
have conflated that with what Broadcom switches support.

> 
>> Now, I do agree there is a duplication of functionality here, and a
>> potential fix would be to avoid starting the PHY state machine if/when
>> the switch supports such a feature (not call phy_start*), that should
>> still get you consistent consistent link partner advertised/status
>> values, question is, does that really benefit anybody though?
> 
> I disagree - it's the DSA polling that needs to go.  The DSA polling
> only looks at the port status, and derives from it the carrier
> state.  The rest of the information is only turned into a printk().
> The PHY state machine does a lot more, recording the link speed so
> that ethtool works on the interface.

I am fine with that approach, it was not exactly clear to me before
reading the code whether the link polling workqueue was doing anything
useful in mv88e6xxx.c, now it is pretty clear to me, this is as
expensive (more actually because of the PPU get/put) and useless since
the PHY library directly polls the individual per-port PHYs.

> 
> If we do want to go the other way, then the phy code needs a rework so
> that it can be properly classed and drivers with non-standard MII
> registers supported without needing to build register emulation layers.
> 

That part is going to be challenging, all the ethtool/PHY library/MII
code is built around the assumption of translating user-configurable
settings into standard MII calls (all the adv_to* etc.), but, then
again, MII is just one possible translation layer here, you could "plug"
another one if your HW supports that (e.g: non-MDIO, but MMIO for
instance which understands basic concepts like speed/link/duplex/pause).

Oh well.
Peter Korsgaard Sept. 15, 2015, 6:32 a.m. UTC | #4
>>>>> "Florian" == Florian Fainelli <f.fainelli@gmail.com> writes:

Hi,

 >> Does the Marvell phy have such a register?  Looking at the register
 >> dump and plugging/unplugging cables seems not to show a register
 >> reporting whether any particular interface has changed state, and
 >> I haven't noticed there being any combined register in anything I've
 >> seen on these switches.

 > It seemed to me like the PPU was meant to provide that, but I cannot
 > find any "summary" register which would give you such a status, must
 > have conflated that with what Broadcom switches support.

It's been some years, but I think the PPU was just to automatically
configure the switch MACs to to match the phy autonegotiation results.

I also don't see any summery register though.
Andrew Lunn Sept. 22, 2015, 2:14 a.m. UTC | #5
On Mon, Sep 14, 2015 at 11:42:54AM +0100, Russell King - ARM Linux wrote:
> Andrew,
> 
> I think you're the current maintainer of the Marvell DSA code, as being
> the most recent author of changes to it. :)

Hi Russell

Sorry for the slow reply, i've been on vacation.

Humm, i suppose i might be the defacto Maintainer for Marvell parts,
but i've no NDA with Marvell, so no access to the data sheets etc.
 
> I've noticed in my testing that the Marvell DSA code seems to poll the
> internal phy link status in mv88e6xxx_poll_link(), and set the network
> device carrier status according to the results.

Peter Korsgaard comment might be correct, the switch needs to know
what the PHY has negotiated. Hence the use of the PPU. There are also
comments in the code that the PPU is needed for indirect access to the
PHY.

So we probably need to keep the PPU, but disable it from changing the
networks stacks idea of the link state, etc.

I will add this to my TODO list to play with it.

  Andrew
diff mbox

Patch

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 26ec2fbfaa89..4c324eafeef2 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -400,6 +400,13 @@  void mv88e6xxx_poll_link(struct dsa_switch *ds)
 		if (dev == NULL)
 			continue;
 
+		/*
+		 * Ignore ports which have a phy; phylib will take care
+		 * of polling the link status for these.
+		 */
+		if (dsa_slave_has_phy(dev))
+			continue;
+
 		link = 0;
 		if (dev->flags & IFF_UP) {
 			port_status = mv88e6xxx_reg_read(ds, REG_PORT(i),
diff --git a/include/net/dsa.h b/include/net/dsa.h
index fbca63ba8f73..b31e9da43ea7 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -176,6 +176,8 @@  static inline bool dsa_is_port_initialized(struct dsa_switch *ds, int p)
 	return ds->phys_port_mask & (1 << p) && ds->ports[p];
 }
 
+extern bool dsa_slave_has_phy(struct net_device *);
+
 static inline u8 dsa_upstream_port(struct dsa_switch *ds)
 {
 	struct dsa_switch_tree *dst = ds->dst;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 35c47ddd04f0..a107242816ff 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -10,6 +10,7 @@ 
 
 #include <linux/list.h>
 #include <linux/etherdevice.h>
+#include <linux/export.h>
 #include <linux/netdevice.h>
 #include <linux/phy.h>
 #include <linux/phy_fixed.h>
@@ -873,6 +874,14 @@  int dsa_slave_resume(struct net_device *slave_dev)
 	return 0;
 }
 
+bool dsa_slave_has_phy(struct net_device *slave_dev)
+{
+	struct dsa_slave_priv *p = netdev_priv(slave_dev);
+
+	return p->phy != NULL;
+}
+EXPORT_SYMBOL_GPL(dsa_slave_has_phy);
+
 int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
 		     int port, char *name)
 {