Message ID | YrWi5oBFn7vR15BH@shell.armlinux.org.uk (mailing list archive) |
---|---|
Series | net: dsa: always use phylink |
On Fri, Jun 24, 2022 at 12:41:26PM +0100, Russell King (Oracle) wrote:
> Hi,
>
> Currently, the core DSA code conditionally uses phylink for CPU and DSA
> ports depending on whether the firmware specifies a fixed-link or a PHY.
> If either of these is specified, then phylink is used for these ports,
> otherwise phylink is not, and we rely on the DSA drivers to "do the
> right thing". However, this detail is not mentioned in the DT binding,
> but Andrew has said that this behaviour has always been something that
> DSA wants.
>
> mv88e6xxx has had support for this for a long time with its "SPEED_MAX"
> thing, which I recently reworked to make use of the mac_capabilities in
> preparation for solving this more fully.
>
> This series is an experiment to solve this properly, and it does this
> in two steps.
>
> The first step consists of the first two patches. Phylink needs to
> know the PHY interface mode that is being used so it can (a) pass the
> right mode into the MAC/PCS etc and (b) know the properties of the
> link and therefore which speeds can be supported across it.
>
> In order to achieve this, the DSA phylink_get_caps() method has an
> extra argument added to it so that DSA drivers can report the
> interface mode that they will be using for this port back to the core
> DSA code, thereby allowing phylink to be initialised with the correct
> interface mode.
>
> Note that this can only be used for CPU and DSA ports as "user" ports
> need a different behaviour - they rely on getting the interface mode
> from phylib, which will only happen if phylink is initialised with
> PHY_INTERFACE_MODE_NA. Unfortunately, changing this behaviour is likely
> to cause widespread regressions.
>
> Obvious questions:
> 1. Should phylink_get_caps() be augmented in this way, or should it be
>    a separate method?
>
> 2. DSA has traditionally used "interface mode for the maximum supported
>    speed on this port" where the interface mode is programmable (via
>    its internal port_max_speed_mode() method) but this is only present
>    for a few of the sub-drivers. Is reporting the current interface
>    mode correct where this method is not implemented?
>
> The second step is to introduce a function that allows phylink to be
> reconfigured after creation time to operate at max-speed fixed-link
> mode for the PHY interface mode, also using the MAC capabilities to
> determine the speed and duplex mode we should be using.
>
> Obvious questions:
> 1. Should we be allowing half-duplex for this?
> 2. If we do allow half-duplex, should we prefer fastest speed over
>    duplex setting, or should we prefer fastest full-duplex speed
>    over any half-duplex?
> 3. How do we sanely switch DSA from its current behaviour to always
>    using phylink for these ports without breakage - this is the
>    difficult one, because it's not obvious which drivers have been
>    coded to work around this quirk of the DSA implementation.
>    For example, if we start forcing the link down before calling
>    dsa_port_phylink_create(), and we then fail to set max-fixed-link,
>    then the CPU/DSA port is going to fail, and we're going to have
>    lots of regressions.
>
> Please look at the patches and make suggestions on how we can proceed
> to clean up this quirk of DSA.

An alternative idea has been put forward by Marek on how to solve this
without involving changes to DSA drivers, but everyone would have to
fill in the supported_interfaces and mac_capabilities. The suggestion
is that DSA calls phylink_set_max_fixed_link(), which looks at the above
two fields, and finds an interface which gives the maximum link speed
if the interface mode has not been specified.
In other words, something like this for phylink_set_max_fixed_link():

	interface = pl->link_interface;
	if (interface != PHY_INTERFACE_MODE_NA) {
		/* Get the speed/duplex capabilities and reduce according to the
		 * specified interface mode.
		 */
		caps = pl->config->mac_capabilities;
		caps &= phylink_interface_to_caps(interface);
	} else {
		interfaces = pl->config->supported_interfaces;
		max_caps = 0;

		/* Find the supported interface mode which gives the maximum
		 * speed.
		 */
		for (intf = 0; intf < PHY_INTERFACE_MODE_MAX; intf++) {
			if (test_bit(intf, interfaces)) {
				caps = pl->config->mac_capabilities;
				caps &= phylink_interface_to_caps(intf);
				if (caps > max_caps) {
					max_caps = caps;
					interface = intf;
				}
			}
		}

		caps = max_caps;
	}

	caps &= ~(MAC_SYM_PAUSE | MAC_ASYM_PAUSE);

	/* If there are no capabilities, then we are not using this default. */
	if (!caps)
		return -EINVAL;

	/* Decode to fastest speed and duplex */
	duplex = DUPLEX_UNKNOWN;
	speed = SPEED_UNKNOWN;
	for (i = 0; i < ARRAY_SIZE(phylink_caps_speeds); i++) {
		if (caps & phylink_caps_speeds[i].fd_mask) {
			duplex = DUPLEX_FULL;
			speed = phylink_caps_speeds[i].speed;
			break;
		} else if (caps & phylink_caps_speeds[i].hd_mask) {
			duplex = DUPLEX_HALF;
			speed = phylink_caps_speeds[i].speed;
			break;
		}
	}

	/* If we didn't find anything, bail. */
	if (speed == SPEED_UNKNOWN)
		return -EINVAL;

	pl->link_interface = interface;
	pl->link_config.interface = interface;
	pl->link_config.speed = speed;
	pl->link_config.duplex = duplex;
	pl->link_config.link = 1;
	pl->cfg_link_an_mode = MLO_AN_FIXED;
	pl->cur_link_an_mode = MLO_AN_FIXED;

This would have the effect of selecting the first interface mode in
numerical order that gives us the fastest link speed.

I should point out that if a DSA port can be programmed in software to
support both SGMII and 1000baseX, this will end up selecting SGMII
irrespective of what the hardware was wire-strapped to and how it was
initially configured. Do we believe that would be acceptable?

Some comments would be really useful on this.
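
The SGMII-over-1000baseX outcome described above can be reproduced
outside the kernel. What follows is a minimal user-space sketch, not
kernel code: the toy_* names, the interface numbering and the capability
bit values are all invented for illustration, and only the selection
loop is modelled on the fragment above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy capability bits (hypothetical values): faster speeds use higher bits. */
enum {
	TOY_CAP_10FD   = 1u << 0,
	TOY_CAP_100FD  = 1u << 1,
	TOY_CAP_1000FD = 1u << 2,
	TOY_CAP_2500FD = 1u << 3,
};

/* Toy interface numbering (hypothetical): only the relative order matters. */
enum toy_interface {
	TOY_IF_NA,
	TOY_IF_SGMII,
	TOY_IF_1000BASEX,
	TOY_IF_2500BASEX,
	TOY_IF_MAX,
};

/* Speeds each toy interface mode can carry (assumed for this example). */
static unsigned int toy_interface_to_caps(enum toy_interface intf)
{
	switch (intf) {
	case TOY_IF_SGMII:
		return TOY_CAP_10FD | TOY_CAP_100FD | TOY_CAP_1000FD;
	case TOY_IF_1000BASEX:
		return TOY_CAP_1000FD;
	case TOY_IF_2500BASEX:
		return TOY_CAP_2500FD;
	default:
		return 0;
	}
}

/*
 * Walk the supported interfaces in numerical order and keep the one whose
 * (MAC caps & interface caps) mask is numerically largest. The strict '>'
 * means that an earlier mode is kept when two masks are equal.
 */
static enum toy_interface toy_pick_interface(unsigned int supported,
					     unsigned int mac_caps,
					     unsigned int *best_caps)
{
	enum toy_interface best = TOY_IF_NA;
	unsigned int max_caps = 0;
	int intf;

	assert(best_caps != NULL);

	for (intf = TOY_IF_NA + 1; intf < TOY_IF_MAX; intf++) {
		unsigned int caps;

		if (!(supported & (1u << intf)))
			continue;

		caps = mac_caps & toy_interface_to_caps((enum toy_interface)intf);
		if (caps > max_caps) {
			max_caps = caps;
			best = (enum toy_interface)intf;
		}
	}

	*best_caps = max_caps;
	return best;
}
```

In this toy model, a MAC limited to 1000FD gets the same mask from SGMII
and 1000base-X, and SGMII is kept because it is visited first. Since
whole masks are compared, a mode that also carries lower speeds produces
a larger mask as well, so SGMII would win here even with a MAC that
supports 10/100/1000.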
> I should point out that if a DSA port can be programmed in software to
> support both SGMII and 1000baseX, this will end up selecting SGMII
> irrespective of what the hardware was wire-strapped to and how it was
> initially configured. Do we believe that would be acceptable?

I'm pretty sure the devel b board has 1000BaseX DSA links between its
two switches. Since both should end up SGMII that should be O.K.

Where we potentially have issues is 1000BaseX to the CPU. This is not
an issue for the Vybrid based boards, since they are fast Ethernet
only, but there are some boards with an IMX6 with 1G ethernet. I guess
they currently use 1000BaseX, and the CPU side of the link probably
has a fixed-link with phy-mode = 1000BaseX. So we might have an issue
there.

    Andrew
On Wed, 29 Jun 2022 09:18:10 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

> > I should point out that if a DSA port can be programmed in software to
> > support both SGMII and 1000baseX, this will end up selecting SGMII
> > irrespective of what the hardware was wire-strapped to and how it was
> > initially configured. Do we believe that would be acceptable?
>
> I'm pretty sure the devel b board has 1000BaseX DSA links between its
> two switches. Since both should end up SGMII that should be O.K.
>
> Where we potentially have issues is 1000BaseX to the CPU. This is not
> an issue for the Vybrid based boards, since they are fast Ethernet
> only, but there are some boards with an IMX6 with 1G ethernet. I guess
> they currently use 1000BaseX, and the CPU side of the link probably
> has a fixed-link with phy-mode = 1000BaseX. So we might have an issue
> there.

If one side of the link (e.g. only the CPU eth interface) has 1000base-x
specified in device-tree explicitly, the code should keep it at
1000base-x for the DSA CPU port...

Marek
On Wed, Jun 29, 2022 at 11:27:50AM +0200, Marek Behún wrote:
> On Wed, 29 Jun 2022 09:18:10 +0200
> Andrew Lunn <andrew@lunn.ch> wrote:
>
> > > I should point out that if a DSA port can be programmed in software to
> > > support both SGMII and 1000baseX, this will end up selecting SGMII
> > > irrespective of what the hardware was wire-strapped to and how it was
> > > initially configured. Do we believe that would be acceptable?
> >
> > I'm pretty sure the devel b board has 1000BaseX DSA links between its
> > two switches. Since both should end up SGMII that should be O.K.
> >
> > Where we potentially have issues is 1000BaseX to the CPU. This is not
> > an issue for the Vybrid based boards, since they are fast Ethernet
> > only, but there are some boards with an IMX6 with 1G ethernet. I guess
> > they currently use 1000BaseX, and the CPU side of the link probably
> > has a fixed-link with phy-mode = 1000BaseX. So we might have an issue
> > there.
>
> If one side of the link (e.g. only the CPU eth interface) has 1000base-x
> specified in device-tree explicitly, the code should keep it at
> 1000base-x for the DSA CPU port...

So does that mean that, if we don't find a phy-mode property in the cpu
port node, we should chase the ethernet property and check there? This
seems to be adding functionality that wasn't there before.
On Wed, 29 Jun 2022 10:34:28 +0100
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Wed, Jun 29, 2022 at 11:27:50AM +0200, Marek Behún wrote:
> > On Wed, 29 Jun 2022 09:18:10 +0200
> > Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > > I should point out that if a DSA port can be programmed in software to
> > > > support both SGMII and 1000baseX, this will end up selecting SGMII
> > > > irrespective of what the hardware was wire-strapped to and how it was
> > > > initially configured. Do we believe that would be acceptable?
> > >
> > > I'm pretty sure the devel b board has 1000BaseX DSA links between its
> > > two switches. Since both should end up SGMII that should be O.K.
> > >
> > > Where we potentially have issues is 1000BaseX to the CPU. This is not
> > > an issue for the Vybrid based boards, since they are fast Ethernet
> > > only, but there are some boards with an IMX6 with 1G ethernet. I guess
> > > they currently use 1000BaseX, and the CPU side of the link probably
> > > has a fixed-link with phy-mode = 1000BaseX. So we might have an issue
> > > there.
> >
> > If one side of the link (e.g. only the CPU eth interface) has 1000base-x
> > specified in device-tree explicitly, the code should keep it at
> > 1000base-x for the DSA CPU port...
>
> So does that mean that, if we don't find a phy-mode property in the cpu
> port node, we should chase the ethernet property and check there? This
> seems to be adding functionality that wasn't there before.

It wasn't there before, but it would make sense IMO.

1. if cpu port has explicit phy-mode, use that
2. otherwise look at the mode defined for peer
3. otherwise try to compute the best possible mode for both peers

Marek
On Wed, Jun 29, 2022 at 09:18:10AM +0200, Andrew Lunn wrote:
> > I should point out that if a DSA port can be programmed in software to
> > support both SGMII and 1000baseX, this will end up selecting SGMII
> > irrespective of what the hardware was wire-strapped to and how it was
> > initially configured. Do we believe that would be acceptable?
>
> I'm pretty sure the devel b board has 1000BaseX DSA links between its
> two switches. Since both should end up SGMII that should be O.K.

Would such a port have a programmable C_Mode, and would it specify that
it supports both SGMII and 1000BaseX ? Without going through a lot of
boards and documentation for every switch, I can't say.

I don't think we can come to any conclusion on what the right way to
deal with this actually is - we don't have enough information about how
this is used across all the platforms we have. I think we can only try
something, get it merged into net-next, and wait to see whether anyone
complains.

When we have a CPU or DSA port without a fixed-link, phy or sfp specified,
I think we should:
(a) use the phy-mode property if present, otherwise,
(b,i) have the DSA driver return the interface mode that it wants to use
      for max speed for CPU and DSA ports.
(b,ii) in the absence of the DSA driver returning a valid interface mode,
       we use the supported_interfaces to find an interface which gives the
       maximum speed (irrespective of duplex?) that falls within the
       mac capabilities.

If all those fail, then things will break, and we will have to wait for
people to report that breakage. Does this sound a sane approach, or
does anyone have any other suggestions how to solve this?
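
The (a), (b,i), (b,ii) ordering above can be sketched as a simple
fallback chain. This is a user-space illustration under stated
assumptions, not DSA code: struct toy_port, the toy_* helpers and the
driver hook signature are hypothetical, and the (b,ii) result is passed
in precomputed rather than derived from supported_interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_mode { TOY_MODE_NA, TOY_MODE_SGMII, TOY_MODE_1000BASEX, TOY_MODE_2500BASEX };

/* Hypothetical per-port description used only by this sketch. */
struct toy_port {
	const char *phy_mode;                   /* firmware "phy-mode" string, or NULL */
	enum toy_mode (*driver_max_mode)(void); /* driver hook for (b,i), or NULL */
	enum toy_mode computed_mode;            /* precomputed result of the (b,ii) search */
};

/* Map a phy-mode string to a toy mode; unknown or missing strings give NA. */
static enum toy_mode toy_parse_mode(const char *name)
{
	if (!name)
		return TOY_MODE_NA;
	if (!strcmp(name, "sgmii"))
		return TOY_MODE_SGMII;
	if (!strcmp(name, "1000base-x"))
		return TOY_MODE_1000BASEX;
	if (!strcmp(name, "2500base-x"))
		return TOY_MODE_2500BASEX;
	return TOY_MODE_NA;
}

/* Example driver hooks (hypothetical). */
static enum toy_mode toy_driver_2500(void) { return TOY_MODE_2500BASEX; }
static enum toy_mode toy_driver_none(void) { return TOY_MODE_NA; }

static enum toy_mode toy_choose_mode(const struct toy_port *port)
{
	enum toy_mode mode;

	assert(port != NULL);

	/* (a) an explicit phy-mode property takes priority */
	mode = toy_parse_mode(port->phy_mode);
	if (mode != TOY_MODE_NA)
		return mode;

	/* (b,i) otherwise use the mode the driver asks for, if it gives one */
	if (port->driver_max_mode) {
		mode = port->driver_max_mode();
		if (mode != TOY_MODE_NA)
			return mode;
	}

	/* (b,ii) otherwise fall back to the computed mode; NA means failure */
	return port->computed_mode;
}
```

A result of TOY_MODE_NA corresponds to the "all those fail" case, where
the caller would have nothing sensible to configure.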
On Wed, 29 Jun 2022 10:43:23 +0100
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Wed, Jun 29, 2022 at 09:18:10AM +0200, Andrew Lunn wrote:
> > > I should point out that if a DSA port can be programmed in software to
> > > support both SGMII and 1000baseX, this will end up selecting SGMII
> > > irrespective of what the hardware was wire-strapped to and how it was
> > > initially configured. Do we believe that would be acceptable?
> >
> > I'm pretty sure the devel b board has 1000BaseX DSA links between its
> > two switches. Since both should end up SGMII that should be O.K.
>
> Would such a port have a programmable C_Mode, and would it specify that
> it supports both SGMII and 1000BaseX ? Without going through a lot of
> boards and documentation for every switch, I can't say.
>
> I don't think we can come to any conclusion on what the right way to
> deal with this actually is - we don't have enough information about how
> this is used across all the platforms we have. I think we can only try
> something, get it merged into net-next, and wait to see whether anyone
> complains.
>
> When we have a CPU or DSA port without a fixed-link, phy or sfp specified,
> I think we should:
> (a) use the phy-mode property if present, otherwise,
> (b,i) have the DSA driver return the interface mode that it wants to use
>       for max speed for CPU and DSA ports.
> (b,ii) in the absence of the DSA driver returning a valid interface mode,
>        we use the supported_interfaces to find an interface which gives the
>        maximum speed (irrespective of duplex?) that falls within the
>        mac capabilities.
>
> If all those fail, then things will break, and we will have to wait for
> people to report that breakage. Does this sound a sane approach, or
> does anyone have any other suggestions how to solve this?

It is a sane approach. But in the future I think we should get rid of
(b,i): I always considered the max_speed_interface() method a temporary
solution, until the drivers report what a specific port supports and the
subsystem can then choose whichever mode it wants that is wired and
supported by hardware. Then we could also make it possible to change
the CPU interface mode via ethtool, which would be cool...

Marek
On Wed, Jun 29, 2022 at 12:10:20PM +0200, Marek Behún wrote:
> On Wed, 29 Jun 2022 10:43:23 +0100
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
>
> > On Wed, Jun 29, 2022 at 09:18:10AM +0200, Andrew Lunn wrote:
> > > > I should point out that if a DSA port can be programmed in software to
> > > > support both SGMII and 1000baseX, this will end up selecting SGMII
> > > > irrespective of what the hardware was wire-strapped to and how it was
> > > > initially configured. Do we believe that would be acceptable?
> > >
> > > I'm pretty sure the devel b board has 1000BaseX DSA links between its
> > > two switches. Since both should end up SGMII that should be O.K.
> >
> > Would such a port have a programmable C_Mode, and would it specify that
> > it supports both SGMII and 1000BaseX ? Without going through a lot of
> > boards and documentation for every switch, I can't say.
> >
> > I don't think we can come to any conclusion on what the right way to
> > deal with this actually is - we don't have enough information about how
> > this is used across all the platforms we have. I think we can only try
> > something, get it merged into net-next, and wait to see whether anyone
> > complains.
> >
> > When we have a CPU or DSA port without a fixed-link, phy or sfp specified,
> > I think we should:
> > (a) use the phy-mode property if present, otherwise,
> > (b,i) have the DSA driver return the interface mode that it wants to use
> >       for max speed for CPU and DSA ports.
> > (b,ii) in the absence of the DSA driver returning a valid interface mode,
> >        we use the supported_interfaces to find an interface which gives the
> >        maximum speed (irrespective of duplex?) that falls within the
> >        mac capabilities.
> >
> > If all those fail, then things will break, and we will have to wait for
> > people to report that breakage. Does this sound a sane approach, or
> > does anyone have any other suggestions how to solve this?
>
> It is a sane approach. But in the future I think we should get rid of
> (b,i): I always considered the max_speed_interface() method a temporary
> solution, until the drivers report what a specific port supports and the
> subsystem can then choose whichever mode it wants that is wired and
> supported by hardware. Then we could also make it possible to change
> the CPU interface mode via ethtool, which would be cool...

I can remotely test clearfog, which seems to do the right thing:

[    5.707839] mv88e6085 f1072004.mdio-mii:04: sif=21 if=21(1000base-x) cap=bd
[    5.715114] mv88e6085 f1072004.mdio-mii:04: configuring for fixed/1000base-x link mode

meaning that the supported interfaces (sif) mask only contains
1000base-x, phylink_create() was called with (if) 1000base-x, and the
capabilities (cap) indicate 1000-fd, 100-(h,f)d, and 10-(h,f)d.
I don't think port 5 on the 88e6176 can support any other modes, so
this isn't a particularly good test.

My ZII boards aren't powered up so I can't test those with the extra
debugging print.

I'll cut a new RFC which includes the debug print so folk can try it
out.
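
The cap=bd value in the log above can be decoded by hand. The sketch
below is a user-space helper, not part of the debug patch: it assumes
that the capability bits follow the MAC_* layout in
include/linux/phylink.h (pause bits in bits 0 and 1, then 10HD, 10FD,
100HD, 100FD, 1000HD, 1000FD), and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed bit layout, modelled on the MAC_* capabilities in phylink.h. */
enum {
	TOY_MAC_SYM_PAUSE  = 1u << 0,
	TOY_MAC_ASYM_PAUSE = 1u << 1,
	TOY_MAC_10HD       = 1u << 2,
	TOY_MAC_10FD       = 1u << 3,
	TOY_MAC_100HD      = 1u << 4,
	TOY_MAC_100FD      = 1u << 5,
	TOY_MAC_1000HD     = 1u << 6,
	TOY_MAC_1000FD     = 1u << 7,
};

struct toy_cap_name {
	unsigned int bit;
	const char *name;
};

/* Listed from fastest to slowest, pause bits last. */
static const struct toy_cap_name toy_cap_names[] = {
	{ TOY_MAC_1000FD, "1000fd" },
	{ TOY_MAC_1000HD, "1000hd" },
	{ TOY_MAC_100FD, "100fd" },
	{ TOY_MAC_100HD, "100hd" },
	{ TOY_MAC_10FD, "10fd" },
	{ TOY_MAC_10HD, "10hd" },
	{ TOY_MAC_ASYM_PAUSE, "asym-pause" },
	{ TOY_MAC_SYM_PAUSE, "sym-pause" },
};

/*
 * Write a space-separated list of the capability names set in caps into
 * buf, stopping early if the next name would not fit. Returns the number
 * of names written; buf is always NUL-terminated.
 */
static int toy_decode_caps(unsigned int caps, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;
	int count = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(toy_cap_names) / sizeof(toy_cap_names[0]); i++) {
		size_t name_len, need;

		if (!(caps & toy_cap_names[i].bit))
			continue;

		name_len = strlen(toy_cap_names[i].name);
		need = name_len + (count ? 1 : 0);
		if (used + need >= len)
			break;	/* no room for this name plus the terminator */

		if (count)
			buf[used++] = ' ';
		memcpy(buf + used, toy_cap_names[i].name, name_len);
		used += name_len;
		buf[used] = '\0';
		count++;
	}

	return count;
}
```

Under that assumed layout, 0xbd sets 1000FD, 100FD, 100HD, 10FD and
10HD, which matches the reading given above, plus the symmetric pause
bit.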