Message ID | 20220723164635.1621911-1-vladimir.oltean@nxp.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] net: dsa: validate that DT nodes of shared ports have the properties they need | expand |
Hi Vladimir
I think this is a first good step.
> +static const char * const dsa_switches_skipping_validation[] = {
One thing to consider is do we want to go one step further and make
this dsa_switches_dont_enforce_validation
Meaning always run the validation, and report what is not valid, but
don't enforce with an -EINVAL for switches on the list?
Maybe it is too early for that, we first need to submit patches to the
mainline DT files to fixes those we can?
Looking at the mv88e6xxx instances, adding fixed-links should not be
too hard. What might be harder is the phy-mode, in particular, what
RGMII delay should be specified.
Andrew
Hi Vladimir,
I love your patch! Yet something to improve:
[auto build test ERROR on net-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/Vladimir-Oltean/net-dsa-validate-that-DT-nodes-of-shared-ports-have-the-properties-they-need/20220724-004902
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 502c6f8cedcce7889ccdefeb88ce36b39acd522f
config: i386-randconfig-a005 (https://download.01.org/0day-ci/archive/20220724/202207240531.wQJyi2N1-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-3) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/67a281495504b3a0a6d5575d7b79121efb83f785
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Vladimir-Oltean/net-dsa-validate-that-DT-nodes-of-shared-ports-have-the-properties-they-need/20220724-004902
git checkout 67a281495504b3a0a6d5575d7b79121efb83f785
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "of_device_compatible_match" [net/dsa/dsa_core.ko] undefined!
On Sat, Jul 23, 2022 at 07:46:35PM +0300, Vladimir Oltean wrote: > Russell King points out that the phylink_pcs conversion of DSA drivers > he is working on is likely to break drivers, because of the movement of > code that didn't use to depend on phylink towards phylink callbacks. <snip> > In this change we add central validation in DSA for the OF properties > required by phylink, in an attempt to sanitize the environment for > future driver writers, and as much as possible for existing driver > maintainers. <snip> > realtek > ~~~~~~~ > > compatible strings: > - realtek,rtl8366rb > - realtek,rtl8365mb > > 2 occurrences in mainline, both descriptions are fine, additionally > rtl8365mb.c has a comment "The device tree firmware should also > specify the link partner of the extension port - either via a > fixed-link or other phy-handle." > > Verdict: opt into validation. Acked-by: Alvin Šipraga <alsi@bang-olufsen.dk>
On Sat, Jul 23, 2022 at 08:09:34PM +0200, Andrew Lunn wrote: > Hi Vladimir > > I think this is a first good step. > > > +static const char * const dsa_switches_skipping_validation[] = { > > One thing to consider is do we want to go one step further and make > this dsa_switches_dont_enforce_validation > > Meaning always run the validation, and report what is not valid, but > don't enforce with an -EINVAL for switches on the list? Can do. The question is what are our prospects of eventually getting rid of incompletely specified DT blobs. If they're likely to stay around forever, maybe printing with dev_err() doesn't sound like such a great idea? I know what's said in Documentation/devicetree/bindings/net/dsa/marvell.txt about not putting a DT blob somewhere where you can't update it, but will that be the case for everyone? Florian, I think some bcm_sf2 device trees are pretty much permanent, based on some of your past commits? > Maybe it is too early for that, we first need to submit patches to the > mainline DT files to fixes those we can? > > Looking at the mv88e6xxx instances, adding fixed-links should not be > too hard. What might be harder is the phy-mode, in particular, what > RGMII delay should be specified. Since DT blobs and kernels have essentially separate lifetimes, I'm thinking it doesn't matter too much if we first fix the mainline DT blobs or not; it's not like that would avoid users seeing errors. Anyway I'm thinking it would be useful in general to verbally resolve some of the incomplete DT descriptions I've pointed out here. This would be a good indication whether we can add automatic logic that comes to the same resolution at least for all known cases.
On Sun, Jul 24, 2022 at 02:21:59PM +0000, Vladimir Oltean wrote: > On Sat, Jul 23, 2022 at 08:09:34PM +0200, Andrew Lunn wrote: > > Hi Vladimir > > > > I think this is a first good step. > > > > > +static const char * const dsa_switches_skipping_validation[] = { > > > > One thing to consider is do we want to go one step further and make > > this dsa_switches_dont_enforce_validation > > > > Meaning always run the validation, and report what is not valid, but > > don't enforce with an -EINVAL for switches on the list? > > Can do. The question is what are our prospects of eventually getting rid > of incompletely specified DT blobs. If they're likely to stay around > forever, maybe printing with dev_err() doesn't sound like such a great > idea? That is a question we need to ask ourselves. Do we want to continue to live with this unhappy situation, or do we want to try to spend some time to make it better. I _think_ i can produce reasonably safe patches for mv88e6xxx DT descriptions in mainline. Maybe once that is done, we can have the mv88e6xxx compatibles warn but don't error out, so encouraging out of tree blobs to be updated. And unlike bcm_sf2, i've not heard of boards where the blob cannot be update. Andrew
Le 24/07/2022 à 07:21, Vladimir Oltean a écrit : > On Sat, Jul 23, 2022 at 08:09:34PM +0200, Andrew Lunn wrote: >> Hi Vladimir >> >> I think this is a first good step. >> >>> +static const char * const dsa_switches_skipping_validation[] = { >> >> One thing to consider is do we want to go one step further and make >> this dsa_switches_dont_enforce_validation >> >> Meaning always run the validation, and report what is not valid, but >> don't enforce with an -EINVAL for switches on the list? > > Can do. The question is what are our prospects of eventually getting rid > of incompletely specified DT blobs. If they're likely to stay around > forever, maybe printing with dev_err() doesn't sound like such a great > idea? > > I know what's said in Documentation/devicetree/bindings/net/dsa/marvell.txt > about not putting a DT blob somewhere where you can't update it, but > will that be the case for everyone? Florian, I think some bcm_sf2 device > trees are pretty much permanent, based on some of your past commits? The Device Tree blob is provided and runtime populated by the bootloader, so we can definitively make changes and missing properties are always easy to add as long as we can keep a reasonable window of time (2 to 3 years seems to be about the right window) for people to update their systems. FWIW, all of the bcm_sf2 based systems have had a fixed-link property for the CPU port. > >> Maybe it is too early for that, we first need to submit patches to the >> mainline DT files to fixes those we can? >> >> Looking at the mv88e6xxx instances, adding fixed-links should not be >> too hard. What might be harder is the phy-mode, in particular, what >> RGMII delay should be specified. > > Since DT blobs and kernels have essentially separate lifetimes, I'm > thinking it doesn't matter too much if we first fix the mainline DT > blobs or not; it's not like that would avoid users seeing errors. > > Anyway I'm thinking it would be useful in general to verbally resolve > some of the incomplete DT descriptions I've pointed out here. This would > be a good indication whether we can add automatic logic that comes to > the same resolution at least for all known cases. Agreed.
On Sun, Jul 24, 2022 at 09:59:14AM -0700, Florian Fainelli wrote: > Le 24/07/2022 à 07:21, Vladimir Oltean a écrit : > > On Sat, Jul 23, 2022 at 08:09:34PM +0200, Andrew Lunn wrote: > > > Hi Vladimir > > > > > > I think this is a first good step. > > > > > > > +static const char * const dsa_switches_skipping_validation[] = { > > > > > > One thing to consider is do we want to go one step further and make > > > this dsa_switches_dont_enforce_validation > > > > > > Meaning always run the validation, and report what is not valid, but > > > don't enforce with an -EINVAL for switches on the list? > > > > Can do. The question is what are our prospects of eventually getting rid > > of incompletely specified DT blobs. If they're likely to stay around > > forever, maybe printing with dev_err() doesn't sound like such a great > > idea? > > > > I know what's said in Documentation/devicetree/bindings/net/dsa/marvell.txt > > about not putting a DT blob somewhere where you can't update it, but > > will that be the case for everyone? Florian, I think some bcm_sf2 device > > trees are pretty much permanent, based on some of your past commits? > > The Device Tree blob is provided and runtime populated by the bootloader, so > we can definitively make changes and missing properties are always easy to > add as long as we can keep a reasonable window of time (2 to 3 years seems > to be about the right window) for people to update their systems. FWIW, all > of the bcm_sf2 based systems have had a fixed-link property for the CPU > port. Ok, so does this mean I can remove these from dsa_switches_dont_enforce_validation? #if IS_ENABLED(CONFIG_NET_DSA_BCM_SF2) "brcm,bcm7445-switch-v4.0", "brcm,bcm7278-switch-v4.0", "brcm,bcm7278-switch-v4.8", #endif > > > Maybe it is too early for that, we first need to submit patches to the > > > mainline DT files to fixes those we can? > > > > > > Looking at the mv88e6xxx instances, adding fixed-links should not be > > > too hard. What might be harder is the phy-mode, in particular, what > > > RGMII delay should be specified. > > > > Since DT blobs and kernels have essentially separate lifetimes, I'm > > thinking it doesn't matter too much if we first fix the mainline DT > > blobs or not; it's not like that would avoid users seeing errors. > > > > Anyway I'm thinking it would be useful in general to verbally resolve > > some of the incomplete DT descriptions I've pointed out here. This would > > be a good indication whether we can add automatic logic that comes to > > the same resolution at least for all known cases. > > Agreed. Ok, so let's start with b53? Correct me if I'm wrong, I'm just looking at code and it's been a while since I've transitioned my drivers from adjust_link. Essentially since b53 still implements b53_adjust_link, this makes dsa_port_link_register_of() (what is called for DSA_PORT_TYPE_CPU and DSA_PORT_TYPE_DSA) take the following route: int dsa_port_link_register_of(struct dsa_port *dp) { struct dsa_switch *ds = dp->ds; struct device_node *phy_np; int port = dp->index; if (!ds->ops->adjust_link) { // this is false, b53 has adjust_link phy_np = of_parse_phandle(dp->dn, "phy-handle", 0); if (of_phy_is_fixed_link(dp->dn) || phy_np) { if (ds->ops->phylink_mac_link_down) ds->ops->phylink_mac_link_down(ds, port, MLO_AN_FIXED, PHY_INTERFACE_MODE_NA); of_node_put(phy_np); return dsa_port_phylink_register(dp); } of_node_put(phy_np); return 0; } // as a result, we never register with phylink for CPU/DSA ports, Russell's logic is avoided dev_warn(ds->dev, "Using legacy PHYLIB callbacks. Please migrate to PHYLINK!\n"); // you do see this warning if (of_phy_is_fixed_link(dp->dn)) return dsa_port_fixed_link_register_of(dp); else return dsa_port_setup_phy_of(dp, true); } So one of dsa_port_fixed_link_register_of() or dsa_port_setup_phy_of() is going to get called in your case. If you have a fixed-link in your device tree, dsa_port_fixed_link_register_of() will fake a call to adjust_link() with the fixed PHY that has its phydev->interface populated based on phy-mode (if missing, this defaults to NA). The b53_adjust_link() function cares about phydev->interface only to the extent of checking for RGMII delays, otherwise it doesn't matter that the phy-mode is missing (arch/arm/boot/dts/bcm47094-linksys-panamera.dts), for practical purposes. If your description is missing a fixed-link (arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts), the other function will get called, dsa_port_setup_phy_of(). Essentially this will call dsa_port_get_phy_device(), which will return NULL, so we will exit early, do nothing and return 0. Right? So b53 is going to be unaffected by Russell's changes, due to it still implementing adjust_link. Now on to the device trees, let's imagine for a second you'll actually delete b53_adjust_link: arch/arm/boot/dts/bcm47094-linksys-panamera.dts - lacks phy-mode phylink will call b53_srab_phylink_get_caps() to determine the maximum supported interface. b53_srab_phylink_get_caps() will circularly look at its current interface, p->mode (which is PHY_INTERFACE_MODE_NA, right? because we lack a phy-mode), and not populate config->supported_interfaces with anything. So what is the expected phy-mode here? phylink couldn't find it :) I think this is a case where the b53 driver would need to be patched to report a default_interface for ports 5, 7 and 8 of brcm,bcm53012-srab. arch/arm/boot/dts/bcm47189-tenda-ac9.dts - lacks phy-mode and fixed-link If my above logic was correct, things here are even worse, because when phylink can't determine the supported_interfaces (no phy-mode), it can't determine the speed for the fixed-link either. arch/arm/boot/dts/bcm47081-luxul-xap-1410.dts arch/arm/boot/dts/bcm47081-luxul-xwr-1200.dts arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/bcm47094-luxul-xbr-4500.dts arch/arm/boot/dts/bcm4708-smartrg-sr400ac.dts arch/arm/boot/dts/bcm4708-luxul-xap-1510.dts arch/arm/boot/dts/bcm953012er.dts arch/arm/boot/dts/bcm4708-netgear-r6250.dts arch/arm/boot/dts/bcm4708-buffalo-wzr-1166dhp-common.dtsi arch/arm/boot/dts/bcm4708-luxul-xwc-1000.dts arch/arm/boot/dts/bcm47094-luxul-abr-4500.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/bcm53016-meraki-mr32.dts - lacks phy-mode I guess that the bottom line is that the b53 driver is safe due to adjust_link. Beyond that, it's up to you how much you want to polish things up. It's going to be quite a bit of work to bring everything up to date.
On 7/23/22 09:46, Vladimir Oltean wrote: [snip] > +static int dsa_shared_port_validate_of_node(struct dsa_port *dp, > + const char *description) > +{ > + struct device_node *dn = dp->dn, *phy_np; > + struct dsa_switch *ds = dp->ds; > + phy_interface_t mode; > + > + /* Suppress validation if using platform data */ > + if (!dn) > + return 0; > + > + if (of_device_compatible_match(ds->dev->of_node, > + dsa_switches_skipping_validation)) > + return 0; > + > + if (of_get_phy_mode(dn, &mode)) { > + dev_err(ds->dev, > + "%s port %d lacks the required \"phy-mode\" property\n", > + description, dp->index); > + return -EINVAL; > + } > + > + phy_np = of_parse_phandle(dn, "phy-handle", 0); > + if (phy_np) { > + of_node_put(phy_np); > + return 0; > + } > + > + /* Note: of_phy_is_fixed_link() also returns true for > + * managed = "in-band-status" > + */ > + if (of_phy_is_fixed_link(dn)) > + return 0; To echo back my reply from the other email here, if we look beyond the rabbit hole and also attempt to parse these properties from the device_node pointed to us by the "ethernet" property, then I think we can trim down the list, and we still have some assurance that we can use phylink and a fixed link property, except we have to replicate the information from the CPU-port connected Ethernet controller. > + > + /* TODO support SFP cages on DSA/CPU ports, > + * here and in dsa_port_link_register_of() > + */ > + dev_err(ds->dev, > + "%s port %d lacks the required \"phy-handle\", \"fixed-link\" or \"managed\" properties\n", > + description, dp->index); > + > + return -EINVAL; > +} > + > static int dsa_port_parse_user(struct dsa_port *dp, const char *name) > { > if (!name) > @@ -1373,6 +1534,12 @@ static int dsa_port_parse_user(struct dsa_port *dp, const char *name) > > static int dsa_port_parse_dsa(struct dsa_port *dp) > { > + int err; > + > + err = dsa_shared_port_validate_of_node(dp, "DSA"); > + if (err) > + return err; > + > dp->type = DSA_PORT_TYPE_DSA; Move up the assignment so you can just read the type from dsa_shared_port_validate_of_node()? > > return 0; > @@ -1411,6 +1578,11 @@ static int dsa_port_parse_cpu(struct dsa_port *dp, struct net_device *master, > struct dsa_switch_tree *dst = ds->dst; > const struct dsa_device_ops *tag_ops; > enum dsa_tag_protocol default_proto; > + int err; > + > + err = dsa_shared_port_validate_of_node(dp, "CPU"); > + if (err) > + return err; Likewise, I don't think there are adverse effects of moving up the dp->type assignment all the way to the top? > > /* Find out which protocol the switch would prefer. */ > default_proto = dsa_get_tag_protocol(dp, master);
On 7/24/22 13:28, Vladimir Oltean wrote: > On Sun, Jul 24, 2022 at 09:59:14AM -0700, Florian Fainelli wrote: >> Le 24/07/2022 à 07:21, Vladimir Oltean a écrit : >>> On Sat, Jul 23, 2022 at 08:09:34PM +0200, Andrew Lunn wrote: >>>> Hi Vladimir >>>> >>>> I think this is a first good step. >>>> >>>>> +static const char * const dsa_switches_skipping_validation[] = { >>>> >>>> One thing to consider is do we want to go one step further and make >>>> this dsa_switches_dont_enforce_validation >>>> >>>> Meaning always run the validation, and report what is not valid, but >>>> don't enforce with an -EINVAL for switches on the list? >>> >>> Can do. The question is what are our prospects of eventually getting rid >>> of incompletely specified DT blobs. If they're likely to stay around >>> forever, maybe printing with dev_err() doesn't sound like such a great >>> idea? >>> >>> I know what's said in Documentation/devicetree/bindings/net/dsa/marvell.txt >>> about not putting a DT blob somewhere where you can't update it, but >>> will that be the case for everyone? Florian, I think some bcm_sf2 device >>> trees are pretty much permanent, based on some of your past commits? >> >> The Device Tree blob is provided and runtime populated by the bootloader, so >> we can definitively make changes and missing properties are always easy to >> add as long as we can keep a reasonable window of time (2 to 3 years seems >> to be about the right window) for people to update their systems. FWIW, all >> of the bcm_sf2 based systems have had a fixed-link property for the CPU >> port. > > Ok, so does this mean I can remove these from dsa_switches_dont_enforce_validation? > > #if IS_ENABLED(CONFIG_NET_DSA_BCM_SF2) > "brcm,bcm7445-switch-v4.0", > "brcm,bcm7278-switch-v4.0", > "brcm,bcm7278-switch-v4.8", > #endif The DTB that is being passed just specifies port 8 with its 'reg' and 'label = "cpu"' and 'ethernet' phandle to the Ethernet controller node. The Ethernet controller node does specify the correct 'phy-mode' and 'fixed-link' properties, so I suppose we can turn on validation if we do support such an use case of "looking beyond" what the CPU port node provided us and replicating that into the CPU-port local phylink creation. There are actually many DTSes in that situation, now there are a few cases where we do have some interesting combinations: - internal switches are typically wired up over GMII or some internal fabric, but the end result is that the ports sort of "automatically" work provided that you have the same link parameters on both side, either explicitly or just by having the hardware and or driver default to some sane settings. In that case, omitting 'fixed-link' on the CPU port node seems reasonable provided that we can retrieve that same piece information from looking at the 'ethernet' or 'link' phandle properties and use that to configure the phylink instance on the CPU/DSA port(s) - external switches do require providing information on either side of the link, especially for 'phy-mode' (especially true with RGMII), however 'fixed-link' could be somehow omitted if we just matched what the CPU/DSA port has > >>>> Maybe it is too early for that, we first need to submit patches to the >>>> mainline DT files to fixes those we can? >>>> >>>> Looking at the mv88e6xxx instances, adding fixed-links should not be >>>> too hard. What might be harder is the phy-mode, in particular, what >>>> RGMII delay should be specified. >>> >>> Since DT blobs and kernels have essentially separate lifetimes, I'm >>> thinking it doesn't matter too much if we first fix the mainline DT >>> blobs or not; it's not like that would avoid users seeing errors. >>> >>> Anyway I'm thinking it would be useful in general to verbally resolve >>> some of the incomplete DT descriptions I've pointed out here. This would >>> be a good indication whether we can add automatic logic that comes to >>> the same resolution at least for all known cases. >> >> Agreed. > > Ok, so let's start with b53? > > Correct me if I'm wrong, I'm just looking at code and it's been a while > since I've transitioned my drivers from adjust_link. > > Essentially since b53 still implements b53_adjust_link, this makes > dsa_port_link_register_of() (what is called for DSA_PORT_TYPE_CPU and > DSA_PORT_TYPE_DSA) take the following route: > > int dsa_port_link_register_of(struct dsa_port *dp) > { > struct dsa_switch *ds = dp->ds; > struct device_node *phy_np; > int port = dp->index; > > if (!ds->ops->adjust_link) { // this is false, b53 has adjust_link > phy_np = of_parse_phandle(dp->dn, "phy-handle", 0); > if (of_phy_is_fixed_link(dp->dn) || phy_np) { > if (ds->ops->phylink_mac_link_down) > ds->ops->phylink_mac_link_down(ds, port, > MLO_AN_FIXED, PHY_INTERFACE_MODE_NA); > of_node_put(phy_np); > return dsa_port_phylink_register(dp); > } > of_node_put(phy_np); > return 0; > } // as a result, we never register with phylink for CPU/DSA ports, Russell's logic is avoided > > dev_warn(ds->dev, > "Using legacy PHYLIB callbacks. Please migrate to PHYLINK!\n"); // you do see this warning > > if (of_phy_is_fixed_link(dp->dn)) > return dsa_port_fixed_link_register_of(dp); > else > return dsa_port_setup_phy_of(dp, true); > } > > So one of dsa_port_fixed_link_register_of() or dsa_port_setup_phy_of() > is going to get called in your case. > > If you have a fixed-link in your device tree, dsa_port_fixed_link_register_of() > will fake a call to adjust_link() with the fixed PHY that has its > phydev->interface populated based on phy-mode (if missing, this defaults to NA). > The b53_adjust_link() function cares about phydev->interface only to the > extent of checking for RGMII delays, otherwise it doesn't matter that > the phy-mode is missing (arch/arm/boot/dts/bcm47094-linksys-panamera.dts), > for practical purposes. > > If your description is missing a fixed-link (arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts), > the other function will get called, dsa_port_setup_phy_of(). > Essentially this will call dsa_port_get_phy_device(), which will return > NULL, so we will exit early, do nothing and return 0. Right? > > So b53 is going to be unaffected by Russell's changes, due to it still > implementing adjust_link. Pretty much yes. > > > > Now on to the device trees, let's imagine for a second you'll actually > delete b53_adjust_link: > > arch/arm/boot/dts/bcm47094-linksys-panamera.dts > - lacks phy-mode > > phylink will call b53_srab_phylink_get_caps() to determine the maximum > supported interface. b53_srab_phylink_get_caps() will circularly look at > its current interface, p->mode (which is PHY_INTERFACE_MODE_NA, right? > because we lack a phy-mode), and not populate config->supported_interfaces > with anything. > > So what is the expected phy-mode here? phylink couldn't find it :) > I think this is a case where the b53 driver would need to be patched to > report a default_interface for ports 5, 7 and 8 of brcm,bcm53012-srab. SRAB means the switch is internal to the SoC, so GMII would be an appropriate PHY interface if we cared so much as to representation *exactly* how the RTL was designed, but that could also be substituted by PHY_INTERFACE_MODE_INTERNAL or PHY_INTERFACE_MODE_NA IMHO, as long as we match the CPU port's 'fixed-link' property. > > arch/arm/boot/dts/bcm47189-tenda-ac9.dts > - lacks phy-mode and fixed-link > > If my above logic was correct, things here are even worse, because when > phylink can't determine the supported_interfaces (no phy-mode), it can't > determine the speed for the fixed-link either. > > arch/arm/boot/dts/bcm47081-luxul-xap-1410.dts > arch/arm/boot/dts/bcm47081-luxul-xwr-1200.dts > arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts > - lacks phy-mode and fixed-link > arch/arm/boot/dts/bcm47094-luxul-xbr-4500.dts > arch/arm/boot/dts/bcm4708-smartrg-sr400ac.dts > arch/arm/boot/dts/bcm4708-luxul-xap-1510.dts > arch/arm/boot/dts/bcm953012er.dts > arch/arm/boot/dts/bcm4708-netgear-r6250.dts > arch/arm/boot/dts/bcm4708-buffalo-wzr-1166dhp-common.dtsi > arch/arm/boot/dts/bcm4708-luxul-xwc-1000.dts > arch/arm/boot/dts/bcm47094-luxul-abr-4500.dts > - lacks phy-mode and fixed-link > arch/arm/boot/dts/bcm53016-meraki-mr32.dts > - lacks phy-mode > > I guess that the bottom line is that the b53 driver is safe due to adjust_link. > Beyond that, it's up to you how much you want to polish things up. > It's going to be quite a bit of work to bring everything up to date. The plan is still to remove adjust_link, the same plan that has been not worked on for years, so do quote me on that and remind me again in 4 years. That said, the bgmac driver does register a fixed link for its CPU port, sort of "on its own" if it does find one. This is largely historical due to the driver having started on MIPS-based chips where there was no DT to describe anything. I do think that we should however fix these various DTSes to have the necessary properties, and it should not be an issue because those devices do not have any bootloader awareness of DT and the DTB is always shipped with the kernel image that OpenWrt/Buildroot uses do install.
Hi Vladimir, On Sat, Jul 23, 2022 at 6:47 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote: [...] > lantiq_gswip > ~~~~~~~~~~~~ > > compatible strings: > - lantiq,xrx200-gswip > - lantiq,xrx300-gswip > - lantiq,xrx330-gswip > > No occurrences in mainline device trees. > > Verdict: opt out of validation, because we don't know. Downstream OpenWrt uses the same format as documented in Documentation/devicetree/bindings/net/dsa/lantiq-gswip.txt A fixed-link is not present, meaning that it's good to opt out for now. I added a TODO so we can fix the .dts downstream in OpenWrt. Upstreaming the .dts files from downstream is also on this ever growing list... Best regards, Martin
Hi Vladimir, On Sat Jul 23 2022, Vladimir Oltean wrote: > hellcreek > ~~~~~~~~~ > > compatible strings: > - hirschmann,hellcreek-de1soc-r1 > > No occurrence in mainline device trees, we don't know. > > Verdict: opt out of validation. The device tree lacks phy-mode and fixed link. Anyway, I'm going to fix the downstream dts now. Thanks, Kurt
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index cac48a741f27..f1ffdbdfd6a1 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -1360,6 +1360,167 @@ static struct dsa_port *dsa_port_touch(struct dsa_switch *ds, int index) return dp; } +/* During the initial DSA driver migration to OF, port nodes were sometimes + * added to device trees with no indication of how they should operate from a + * link management perspective (phy-handle, fixed-link, etc). Additionally, the + * phy-mode may be absent. The interpretation of these port OF nodes depends on + * their type. + * + * User ports with no phy-handle or fixed-link are expected to connect to an + * internal PHY located on the ds->slave_mii_bus at an MDIO address equal to + * the port number. This description is still actively supported. + * + * Shared (CPU and DSA) ports with no phy-handle or fixed-link are expected to + * operate at the maximum speed that their phy-mode is capable of. If the + * phy-mode is absent, they are expected to operate using the phy-mode + * supported by the port that gives the highest link speed. It is unspecified + * if the port should use flow control or not, half duplex or full duplex, or + * if the phy-mode is a SERDES link, whether in-band autoneg is expected to be + * enabled or not. + * + * In the latter case of shared ports, omitting the link management description + * from the firmware node is deprecated and strongly discouraged. DSA uses + * phylink, which rejects the firmware nodes of these ports for lacking + * required properties. + * + * For switches in this table, DSA will skip validation and will later omit + * registering a phylink instance for the shared ports, if they lack a + * fixed-link, a phy-handle, or a managed = "in-band-status" property. + * It becomes the responsibility of the driver to ensure that these ports + * operate at the maximum speed (whatever this means) and will interoperate + * with the DSA master or other cascade port, since phylink methods will not be + * invoked for them. + * + * If you are considering expanding this table for newly introduced switches, + * think again. It is OK to remove switches from this table if there aren't DT + * blobs in circulation which rely on defaulting the shared ports. + */ +static const char * const dsa_switches_skipping_validation[] = { +#if IS_ENABLED(CONFIG_NET_DSA_XRS700X) + "arrow,xrs7003e", + "arrow,xrs7003f", + "arrow,xrs7004e", + "arrow,xrs7004f", +#endif +#if IS_ENABLED(CONFIG_B53) + "brcm,bcm5325", + "brcm,bcm53115", + "brcm,bcm53125", + "brcm,bcm53128", + "brcm,bcm5365", + "brcm,bcm5389", + "brcm,bcm5395", + "brcm,bcm5397", + "brcm,bcm5398", + "brcm,bcm53010-srab", + "brcm,bcm53011-srab", + "brcm,bcm53012-srab", + "brcm,bcm53018-srab", + "brcm,bcm53019-srab", + "brcm,bcm5301x-srab", + "brcm,bcm11360-srab", + "brcm,bcm58522-srab", + "brcm,bcm58525-srab", + "brcm,bcm58535-srab", + "brcm,bcm58622-srab", + "brcm,bcm58623-srab", + "brcm,bcm58625-srab", + "brcm,bcm88312-srab", + "brcm,cygnus-srab", + "brcm,nsp-srab", + "brcm,omega-srab", + "brcm,bcm3384-switch", + "brcm,bcm6328-switch", + "brcm,bcm6368-switch", + "brcm,bcm63xx-switch", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_BCM_SF2) + "brcm,bcm7445-switch-v4.0", + "brcm,bcm7278-switch-v4.0", + "brcm,bcm7278-switch-v4.8", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_HIRSCHMANN_HELLCREEK) + "hirschmann,hellcreek-de1soc-r1", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_LANTIQ_GSWIP) + "lantiq,xrx200-gswip", + "lantiq,xrx300-gswip", + "lantiq,xrx330-gswip", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_MV88E6060) + "marvell,mv88e6060", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_MV88E6XXX) + "marvell,mv88e6085", + "marvell,mv88e6190", + "marvell,mv88e6250", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_MICROCHIP_KSZ_COMMON) + "microchip,ksz8765", + "microchip,ksz8794", + "microchip,ksz8795", + "microchip,ksz8863", + "microchip,ksz8873", + "microchip,ksz9477", + "microchip,ksz9897", + "microchip,ksz9893", + "microchip,ksz9563", + "microchip,ksz8563", + "microchip,ksz9567", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_SMSC_LAN9303_MDIO) + "smsc,lan9303-mdio", +#endif +#if IS_ENABLED(CONFIG_NET_DSA_SMSC_LAN9303_I2C) + "smsc,lan9303-i2c", +#endif + NULL, +}; + +static int dsa_shared_port_validate_of_node(struct dsa_port *dp, + const char *description) +{ + struct device_node *dn = dp->dn, *phy_np; + struct dsa_switch *ds = dp->ds; + phy_interface_t mode; + + /* Suppress validation if using platform data */ + if (!dn) + return 0; + + if (of_device_compatible_match(ds->dev->of_node, + dsa_switches_skipping_validation)) + return 0; + + if (of_get_phy_mode(dn, &mode)) { + dev_err(ds->dev, + "%s port %d lacks the required \"phy-mode\" property\n", + description, dp->index); + return -EINVAL; + } + + phy_np = of_parse_phandle(dn, "phy-handle", 0); + if (phy_np) { + of_node_put(phy_np); + return 0; + } + + /* Note: of_phy_is_fixed_link() also returns true for + * managed = "in-band-status" + */ + if (of_phy_is_fixed_link(dn)) + return 0; + + /* TODO support SFP cages on DSA/CPU ports, + * here and in dsa_port_link_register_of() + */ + dev_err(ds->dev, + "%s port %d lacks the required \"phy-handle\", \"fixed-link\" or \"managed\" properties\n", + description, dp->index); + + return -EINVAL; +} + static int dsa_port_parse_user(struct dsa_port *dp, const char *name) { if (!name) @@ -1373,6 +1534,12 @@ static int dsa_port_parse_user(struct dsa_port *dp, const char *name) static int dsa_port_parse_dsa(struct dsa_port *dp) { + int err; + + err = dsa_shared_port_validate_of_node(dp, "DSA"); + if (err) + return err; + dp->type = DSA_PORT_TYPE_DSA; return 0; @@ -1411,6 +1578,11 @@ static int dsa_port_parse_cpu(struct dsa_port *dp, struct net_device *master, struct dsa_switch_tree *dst = ds->dst; const struct dsa_device_ops *tag_ops; enum dsa_tag_protocol default_proto; + int err; + + err = dsa_shared_port_validate_of_node(dp, "CPU"); + if (err) + return err; /* Find out which protocol the switch would prefer. */ default_proto = dsa_get_tag_protocol(dp, master);
Russell King points out that the phylink_pcs conversion of DSA drivers he is working on is likely to break drivers, because of the movement of code that didn't use to depend on phylink towards phylink callbacks. One example is mv88e6xxx, where DSA and CPU ports are configured during mv88e6xxx_setup() -> mv88e6xxx_setup_port() (therefore outside of phylink). DSA was not always integrated with phylink, and when the early drivers were converted from platform data to the new DSA bindings, there was no information kept in the platform data structures about port link speeds, so as a result, there was no information translated into the first DT bindings. https://lore.kernel.org/all/YtXFtTsf++AeDm1l@lunn.ch/ DSA first became integrated with phylink for user ports, where the interpretation of a port OF node with lacking information is different. Then there was an initial attempt to integrate phylink with CPU and DSA ports as well (these have no net devices), which was fixed up by a workaround added in commit a20f997010c4 ("net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed"). The above workaround checks for the presence of phy-handle/fixed-link/ managed properties inside the OF nodes of CPU and DSA ports, and avoids registering phylink if they're missing. This is the state of things today, but what the workaround commit could have done better is that it didn't stop the proliferation of port OF nodes with lacking information. Today we have drivers introduced years after the phylink migration of CPU/DSA ports, and yet we're still not completely sure whether all new drivers use phylink, because this depends on dynamic information (DT blob, which may very well not be upstream, because why would it). Driver maintainers may even be unaware about the fact that not specifying fixed-link/phy-handle for CPU/DSA ports is legal for the old drivers, and even works. In this change we add central validation in DSA for the OF properties required by phylink, in an attempt to sanitize the environment for future driver writers, and as much as possible for existing driver maintainers. Technically no driver except sja1105 and felix (partially) validates these properties, but perhaps due to subtle reasons, some of the other existing drivers may not actually work properly with a port OF node that lacks a complete description. There isn't any way to know except by deleting the fixed-link (or phy-mode or both) on a CPU port and trying. There isn't a desire to make drivers that never worked start working with these DT blobs, but rather to eventually move all drivers towards using phylink on shared ports, including when the DT information is lacking. There is a parallel effort to artificially create a description for phylink for these ports; however it involves guesswork and may get things wrong. This is where this change comes in: drivers which do not opt out of strict validation do not need to concern themselves with how to artificially create the description, and how to configure themselves for the "maximum speed" mode. We can't fully know what is the situation with downstream DT blobs, but we can guess the overall trend by studying the DT blobs that were submitted upstream. If there are upstream blobs that have lacking descriptions, we take it as very likely that there are many more downstream blobs that do so too. If all upstream blobs have complete descriptions, we take that as a hint that the driver is a candidate for strict validation (considering that most bindings are copy-pasted). If there are no upstream DT blobs, we take the conservative route of skipping validation, unless the driver maintainer instructs us otherwise. The driver situation is as follows: mv88e6xxx ~~~~~~~~~ compatible strings: - marvell,mv88e6085 - marvell,mv88e6190 - marvell,mv88e6250 Device trees that have incomplete descriptions of CPU or DSA ports: arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi - lacks phy-mode arch/arm64/boot/dts/marvell/cn9130-crb.dtsi - lacks phy-mode and fixed-link arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts - lacks phy-mode arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts - lacks phy-mode arch/arm/boot/dts/vf610-zii-spb4.dts - lacks phy-mode arch/arm/boot/dts/vf610-zii-cfu1.dts - lacks phy-mode arch/arm/boot/dts/vf610-zii-dev-rev-c.dts - lacks phy-mode on CPU port, fixed-link on DSA ports arch/arm/boot/dts/vf610-zii-dev-rev-b.dts - lacks phy-mode on CPU port arch/arm/boot/dts/armada-381-netgear-gs110emx.dts - lacks phy-mode arch/arm/boot/dts/vf610-zii-scu4-aib.dts - lacks fixed-link on xgmii DSA ports and/or in-band-status on 2500base-x DSA ports, and phy-mode on CPU port arch/arm/boot/dts/imx6qdl-gw5904.dtsi - lacks phy-mode and fixed-link arch/arm/boot/dts/armada-385-clearfog-gtr-l8.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/vf610-zii-ssmb-dtu.dts - lacks phy-mode arch/arm/boot/dts/kirkwood-dir665.dts - lacks phy-mode arch/arm/boot/dts/kirkwood-rd88f6281.dtsi - lacks phy-mode arch/arm/boot/dts/orion5x-netgear-wnr854t.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/armada-388-clearfog.dts - lacks phy-mode arch/arm/boot/dts/armada-xp-linksys-mamba.dts - lacks phy-mode arch/arm/boot/dts/armada-385-linksys.dtsi - lacks phy-mode arch/arm/boot/dts/imx6q-b450v3.dts arch/arm/boot/dts/imx6q-b850v3.dts - has a phy-handle but not a phy-mode? arch/arm/boot/dts/armada-370-rd.dts - lacks phy-mode arch/arm/boot/dts/kirkwood-linksys-viper.dts - lacks phy-mode arch/arm/boot/dts/imx51-zii-rdu1.dts - lacks phy-mode arch/arm/boot/dts/imx51-zii-scu2-mezz.dts - lacks phy-mode arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi - lacks phy-mode arch/arm/boot/dts/armada-385-clearfog-gtr-s4.dts - lacks phy-mode and fixed-link Verdict: opt out of validation. ar9331 ~~~~~~ compatible strings: - qca,ar9331-switch 1 occurrence in mainline device trees, part of SoC dtsi (arch/mips/boot/dts/qca/ar9331.dtsi), description is not problematic. Verdict: opt into validation. qca8k ~~~~~ compatible strings: - qca,qca8327 - qca,qca8328 - qca,qca8334 - qca,qca8337 5 occurrences in mainline device trees, none of the descriptions are problematic. Verdict: opt into validation. hellcreek ~~~~~~~~~ compatible strings: - hirschmann,hellcreek-de1soc-r1 No occurrence in mainline device trees, we don't know. Verdict: opt out of validation. lan9303 ~~~~~~~ compatible strings: - smsc,lan9303-mdio - smsc,lan9303-i2c 1 occurrence in mainline device trees: arch/arm/boot/dts/imx53-kp-hsc.dts - no phy-mode, no fixed-link Verdict: opt out of validation. microchip ksz ~~~~~~~~~~~~~ compatible strings: - microchip,ksz8765 - microchip,ksz8794 - microchip,ksz8795 - microchip,ksz8863 - microchip,ksz8873 - microchip,ksz9477 - microchip,ksz9897 - microchip,ksz9893 - microchip,ksz9563 - microchip,ksz8563 - microchip,ksz9567 - microchip,lan9370 - microchip,lan9371 - microchip,lan9372 - microchip,lan9373 - microchip,lan9374 5 occurrences in mainline device trees, all descriptions are valid. But we had a snafu for the ksz8795 and ksz9477 drivers where the phy-mode property would be expected to be located directly under the 'switch' node rather than under a port OF node. It was fixed by commit edecfa98f602 ("net: dsa: microchip: look for phy-mode in port nodes"). The driver still has compatibility with the old DT blobs. The lan937x support was added later than the above snafu was fixed, and even though it has support for the broken DT blobs by virtue of sharing a common probing function, I'll take it that its DT blobs are correct. Verdict: opt lan937x into validation, and the others out. bcm_sf2 ~~~~~~~ compatible strings: - brcm,bcm4908-switch - brcm,bcm7445-switch-v4.0 - brcm,bcm7278-switch-v4.0 - brcm,bcm7278-switch-v4.8 A single occurrence in mainline (arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi), part of a SoC dtsi, valid description. Verdict: opt into strict validation the switch we know, and opt out the ones we don't. ocelot ~~~~~~ compatible strings: - mscc,vsc9953-switch - felix (arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi) is a PCI device, has no compatible string 2 occurrences in mainline, both are part of SoC dtsi and complete. Verdict: opt into strict validation. mv88e6060 ~~~~~~~~~ compatible string: - marvell,mv88e6060 no occurrences in mainline, nobody knows anybody who uses it. Verdict: opt out of strict validation. xrs700x ~~~~~~~ compatible strings: - arrow,xrs7003e - arrow,xrs7003f - arrow,xrs7004e - arrow,xrs7004f no occurrences in mainline Verdict: opt out of strict validation. mt7530 ~~~~~~ compatible strings - mediatek,mt7621 - mediatek,mt7530 - mediatek,mt7531 Multiple occurrences in mainline device trees, one is part of an SoC dtsi (arch/mips/boot/dts/ralink/mt7621.dtsi), all descriptions are fine. Verdict: opt into strict validation. lantiq_gswip ~~~~~~~~~~~~ compatible strings: - lantiq,xrx200-gswip - lantiq,xrx300-gswip - lantiq,xrx330-gswip No occurrences in mainline device trees. Verdict: opt out of validation, because we don't know. vsc73xx ~~~~~~~ compatible strings: - vitesse,vsc7385 - vitesse,vsc7388 - vitesse,vsc7395 - vitesse,vsc7398 2 occurrences in mainline device trees, both descriptions are fine. Verdict: opt into validation. rzn1_a5psw ~~~~~~~~~~ compatible strings: - renesas,rzn1-a5psw One single occurrence, part of SoC dtsi (arch/arm/boot/dts/r9a06g032.dtsi), description is fine. Verdict: opt into validation. sja1105 ~~~~~~~ Driver already validates its port OF nodes in sja1105_parse_ports_node(). Verdict: opt into validation. realtek ~~~~~~~ compatible strings: - realtek,rtl8366rb - realtek,rtl8365mb 2 occurrences in mainline, both descriptions are fine, additionally rtl8365mb.c has a comment "The device tree firmware should also specify the link partner of the extension port - either via a fixed-link or other phy-handle." Verdict: opt into validation. b53 ~~~ compatible strings: - brcm,bcm5325 - brcm,bcm53115 - brcm,bcm53125 - brcm,bcm53128 - brcm,bcm5365 - brcm,bcm5389 - brcm,bcm5395 - brcm,bcm5397 - brcm,bcm5398 - brcm,bcm53010-srab - brcm,bcm53011-srab - brcm,bcm53012-srab - brcm,bcm53018-srab - brcm,bcm53019-srab - brcm,bcm5301x-srab - brcm,bcm11360-srab - brcm,bcm58522-srab - brcm,bcm58525-srab - brcm,bcm58535-srab - brcm,bcm58622-srab - brcm,bcm58623-srab - brcm,bcm58625-srab - brcm,bcm88312-srab - brcm,cygnus-srab - brcm,nsp-srab - brcm,omega-srab - brcm,bcm3384-switch - brcm,bcm6328-switch - brcm,bcm6368-switch - brcm,bcm63xx-switch I've found at least these mainline DT blobs with problems: arch/arm/boot/dts/bcm47094-linksys-panamera.dts - lacks phy-mode arch/arm/boot/dts/bcm47189-tenda-ac9.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/bcm47081-luxul-xap-1410.dts arch/arm/boot/dts/bcm47081-luxul-xwr-1200.dts arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/bcm47094-luxul-xbr-4500.dts arch/arm/boot/dts/bcm4708-smartrg-sr400ac.dts arch/arm/boot/dts/bcm4708-luxul-xap-1510.dts arch/arm/boot/dts/bcm953012er.dts arch/arm/boot/dts/bcm4708-netgear-r6250.dts arch/arm/boot/dts/bcm4708-buffalo-wzr-1166dhp-common.dtsi arch/arm/boot/dts/bcm4708-luxul-xwc-1000.dts arch/arm/boot/dts/bcm47094-luxul-abr-4500.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/bcm53016-meraki-mr32.dts - lacks phy-mode Verdict: opt all switches out of strict validation. Because there is a pattern where newly added switches reuse existing drivers more often than introducing new ones, I've opted for deciding who gets to skip validation based on an OF compatible match table in the DSA core. The alternative would have been to add another boolean property to struct dsa_switch, like configure_vlan_while_not_filtering. But this avoids situations where sometimes driver maintainers obfuscate what goes on by sharing a common probing function, and therefore making new switches inherit old quirks. This change puts an upper bound to the number of switches for which a link management description must be faked, and also makes it clearer which switches need to be tested using that logic and which don't. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> --- This patch shouldn't need testing per se (I've tested it on a validating and on a non-validating driver), but it would still benefit from some ACKs/NACKs from driver maintainers ("hey, move me from this camp to that camp", or "I'm ok in this camp, thanks"). It seems a pity especially to opt out of validation just because we don't know what DT blobs of a driver generally look like. Don't be shy if you aren't in the MAINTAINERS file or even copied to this email but stumbled upon it; some switch drivers don't have a dedicated maintainer of their own. I've selected some people in CC based on their past contributions. net/dsa/dsa2.c | 172 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 172 insertions(+)