Message ID | 20231208045054.27966-3-luizluca@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: dsa: realtek: variants to drivers, interfaces to a common module | expand |
> diff --git a/drivers/net/dsa/realtek/realtek-smi.c b/drivers/net/dsa/realtek/realtek-smi.c > index 755546ed8db6..ddcae546afbc 100644 > --- a/drivers/net/dsa/realtek/realtek-smi.c > +++ b/drivers/net/dsa/realtek/realtek-smi.c > @@ -389,15 +389,15 @@ static int realtek_smi_setup_mdio(struct dsa_switch *ds) > priv->user_mii_bus->write = realtek_smi_mdio_write; > snprintf(priv->user_mii_bus->id, MII_BUS_ID_SIZE, "SMI-%d", > ds->index); > - priv->user_mii_bus->dev.of_node = mdio_np; > priv->user_mii_bus->parent = priv->dev; > ds->user_mii_bus = priv->user_mii_bus; > > ret = devm_of_mdiobus_register(priv->dev, priv->user_mii_bus, mdio_np); > + of_node_put(mdio_np); I would like some advice on this line. I have seen similar code like this but I'm not sure if a function that receives that node as an argument should be responsible to call kobject_get() (or alike) if it keeps a reference for that node. The of_mdiobus_register does not keep that node but it does get some child nodes. I don't know if it is ok to free the parent node (if that ever happens when a child is still in use). Regards, Luiz
On Fri, Dec 08, 2023 at 02:13:25AM -0300, Luiz Angelo Daros de Luca wrote: > > diff --git a/drivers/net/dsa/realtek/realtek-smi.c b/drivers/net/dsa/realtek/realtek-smi.c > > index 755546ed8db6..ddcae546afbc 100644 > > --- a/drivers/net/dsa/realtek/realtek-smi.c > > +++ b/drivers/net/dsa/realtek/realtek-smi.c > > @@ -389,15 +389,15 @@ static int realtek_smi_setup_mdio(struct dsa_switch *ds) > > priv->user_mii_bus->write = realtek_smi_mdio_write; > > snprintf(priv->user_mii_bus->id, MII_BUS_ID_SIZE, "SMI-%d", > > ds->index); > > - priv->user_mii_bus->dev.of_node = mdio_np; You do not really justify removing this in your patch. This is not a purely cosmetic change because now the associated mdiodev will not be associated with the OF node. I don't know if there is any consequence to that but it is usually nice to populate this info in the device struct when it is actually available. > > priv->user_mii_bus->parent = priv->dev; > > ds->user_mii_bus = priv->user_mii_bus; > > > > ret = devm_of_mdiobus_register(priv->dev, priv->user_mii_bus, mdio_np); > > + of_node_put(mdio_np); > > I would like some advice on this line. I have seen similar code like > this but I'm not sure if a function that receives that node as an > argument should be responsible to call kobject_get() (or alike) if it > keeps a reference for that node. The of_mdiobus_register does not keep > that node but it does get some child nodes. I don't know if it is ok > to free the parent node (if that ever happens when a child is still in > use). Yes, it's OK to do that. > > Regards, > > Luiz
> On Fri, Dec 08, 2023 at 02:13:25AM -0300, Luiz Angelo Daros de Luca wrote: > > > diff --git a/drivers/net/dsa/realtek/realtek-smi.c b/drivers/net/dsa/realtek/realtek-smi.c > > > index 755546ed8db6..ddcae546afbc 100644 > > > --- a/drivers/net/dsa/realtek/realtek-smi.c > > > +++ b/drivers/net/dsa/realtek/realtek-smi.c > > > @@ -389,15 +389,15 @@ static int realtek_smi_setup_mdio(struct dsa_switch *ds) > > > priv->user_mii_bus->write = realtek_smi_mdio_write; > > > snprintf(priv->user_mii_bus->id, MII_BUS_ID_SIZE, "SMI-%d", > > > ds->index); > > > - priv->user_mii_bus->dev.of_node = mdio_np; > > You do not really justify removing this in your patch. This is not a > purely cosmetic change because now the associated mdiodev will not be > associated with the OF node. I don't know if there is any consequence to > that but it is usually nice to populate this info in the device struct > when it is actually available. Reviewing the code again, I believe it was not just misplacing the of_put_node() but probably calling it twice. devm_mdiobus_alloc() doesn't set the dev in mii_bus. So, dev is all zeros. The dev.of_node normal place to be defined is: devm_of_mdiobus_register() __devm_of_mdiobus_register() __of_mdiobus_register() device_set_node() The only way for that value, set by the line I removed, to persist is when the devm_of_mdiobus_register() fails before device_set_node(). My guess is that it was set to be used by realtek_smi_remove() if it is called when registration fails. However, in that case, both realtek_smi_setup_mdio() and realtek_smi_setup_mdio() would put the node. So, either the line is useless or it will effectively result in calling of_node_put() twice. If I really needed to put that node in the realtek_smi_remove(), I would use a dedicated field in realtek_priv instead of reusing a reference for it inside another structure. I'll add some notes to the commit message about all these but moving the of_node_put() to the same function that gets the node solved all the issues. Regards, Luiz
On Fri, Dec 08, 2023 at 03:05:41PM -0300, Luiz Angelo Daros de Luca wrote: > Reviewing the code again, I believe it was not just misplacing the > of_put_node() but probably calling it twice. > > devm_mdiobus_alloc() doesn't set the dev in mii_bus. So, dev is all > zeros. The dev.of_node normal place to be defined is: > > devm_of_mdiobus_register() > __devm_of_mdiobus_register() > __of_mdiobus_register() > device_set_node() > > The only way for that value, set by the line I removed, to persist is > when the devm_of_mdiobus_register() fails before device_set_node(). Did you consider that __of_mdiobus_register() -> device_set_node() is actually overwriting priv->user_mii_bus->dev.of_node with the same value? So the reference to mdio_np persists even if technically overwritten. The fact that the assignment looks redundant is another story. > My guess is that it was set to be used by realtek_smi_remove() if it > is called when registration fails. However, in that case, both > realtek_smi_setup_mdio() and realtek_smi_setup_mdio() would put the You listed the same function name twice. You meant realtek_smi_remove() the second time? > node. So, either the line is useless or it will effectively result in > calling of_node_put() twice. False logic, since realtek_smi_remove() is not called when probe() fails. ds->ops->setup() is called from probe() context. So no double of_node_put(). That's a general rule with the kernel API. When a setup API function fails, it is responsible of cleaning up the temporary things it did. The teardown API function is only called when the setup was performed fully. (the only exception I'm aware of is the Qdisc API, but that's not exactly the best model to follow) > If I really needed to put that node in the realtek_smi_remove(), I > would use a dedicated field in realtek_priv instead of reusing a > reference for it inside another structure. > > I'll add some notes to the commit message about all these but moving > the of_node_put() to the same function that gets the node solved all > the issues. "Solved all the issues" - what are those issues, first of all? The simple fact is: of_get_compatible_child() returns an OF node with an elevated refcount. It passes it to of_mdiobus_register() which does not take ownership of it per se, but assigns it to bus->dev.of_node, which is accessible until device_del() from mdiobus_unregister(). The PHY library does not make the ownership rules of the of_node very clear, but since it takes no reference on it, it will fail in subtle ways if you pull the carpet from under its feet. For example, I expect of_mdio_find_bus() to fail. That is used only rarely, like by the MDIO mux driver which I suppose you haven't tested. If you want, you could make the OF MDIO API get() and put() the reference, instead of using something it doesn't fully own. But currently the code doesn't do that. Try to acknowledge what exists, first.
> > Reviewing the code again, I believe it was not just misplacing the > > of_put_node() but probably calling it twice. > > > > devm_mdiobus_alloc() doesn't set the dev in mii_bus. So, dev is all > > zeros. The dev.of_node normal place to be defined is: > > > > devm_of_mdiobus_register() > > __devm_of_mdiobus_register() > > __of_mdiobus_register() > > device_set_node() > > > > The only way for that value, set by the line I removed, to persist is > > when the devm_of_mdiobus_register() fails before device_set_node(). > > Did you consider that __of_mdiobus_register() -> device_set_node() is > actually overwriting priv->user_mii_bus->dev.of_node with the same value? > So the reference to mdio_np persists even if technically overwritten. > The fact that the assignment looks redundant is another story. Yes, I believe it is just redundant. > > My guess is that it was set to be used by realtek_smi_remove() if it > > is called when registration fails. However, in that case, both > > realtek_smi_setup_mdio() and realtek_smi_setup_mdio() would put the > > You listed the same function name twice. You meant realtek_smi_remove() > the second time? yes > > node. So, either the line is useless or it will effectively result in > > calling of_node_put() twice. > > False logic, since realtek_smi_remove() is not called when probe() fails. > ds->ops->setup() is called from probe() context. So no double of_node_put(). > That's a general rule with the kernel API. When a setup API function fails, > it is responsible of cleaning up the temporary things it did. The > teardown API function is only called when the setup was performed fully. So, "the line is useless". > (the only exception I'm aware of is the Qdisc API, but that's not > exactly the best model to follow) > > > If I really needed to put that node in the realtek_smi_remove(), I > > would use a dedicated field in realtek_priv instead of reusing a > > reference for it inside another structure. > > > > I'll add some notes to the commit message about all these but moving > > the of_node_put() to the same function that gets the node solved all > > the issues. > > "Solved all the issues" - what are those issues, first of all? 1) the useless assignment 2) the (possible) double of_node_put(), which proved to be false. > The simple fact is: of_get_compatible_child() returns an OF node with an > elevated refcount. It passes it to of_mdiobus_register() which does not > take ownership of it per se, but assigns it to bus->dev.of_node, which > is accessible until device_del() from mdiobus_unregister(). Normally, when you have a refcount system, whenever you have a reference to an object, you should increase the refcount. I thought that every time you assign a kobject to a structure, you should get it as well (and put it when you deallocate it). But that's just what I would expect, not something I found in docs. I see distinct behaviors with methods that assign the dev.of_node using device_set_node() in OF MDIO API code and that's not good: static int of_mdiobus_register_device(struct mii_bus *mdio, struct device_node *child, u32 addr) { (...) fwnode_handle_get(fwnode); device_set_node(&mdiodev->dev, fwnode); (...) } int fwnode_mdiobus_phy_device_register(struct mii_bus *mdio, struct phy_device *phy, struct fwnode_handle *child, u32 addr) { (...) fwnode_handle_get(child); device_set_node(&phy->mdio.dev, child); (...) } int of_mdiobus_phy_device_register(struct mii_bus *mdio, struct phy_device *phy, struct device_node *child, u32 addr) { return fwnode_mdiobus_phy_device_register(mdio, phy, of_fwnode_handle(child), } int __of_mdiobus_register(struct mii_bus *mdio, struct device_node *np, struct module *owner) { (...) device_set_node(&mdio->dev, of_fwnode_handle(np)); (...) for_each_available_child_of_node(np, child) { (...) if (of_mdiobus_child_is_phy(child)) rc = of_mdiobus_register_phy(mdio, child, addr); else rc = of_mdiobus_register_device(mdio, child, addr); } }) Each deals differently with the device_node it receives. Both of_mdiobus_register_phy and of_mdiobus_register_device gets the child before assigning it to the device but not __of_mdiobus_register. Why? After some more study, I think it is just because, while an of_node_get() just before device_set_node() fits nicely in __of_mdiobus_register(), there is not a good place in of_mdio to put it. We don't have an of_mdiobus_unregister(). The unregistration happens only in mdiobus_unregister(), where, I guess, it should avoid OF-specific code. Even if we put OF code there, we would need to know during mdiobus_unregister() if the bus->dev.of_node was gotten by of_mdio or someone else. I believe it is not nice to externally assign dev.of_node directly to mdiobus but realtek switch driver is doing just that and others might be doing the same thing. The delegation of of_node_get/put to the caller seems to be just an easy workaround the fact that there is no good place to put a node that of_mdio would get. For devm functions, we could include the get/put call creating a new devm_of_mdiobus_unregister() but I believe devm and non-devm needs to be equivalent (except for the resource deallocation). > The PHY library does not make the ownership rules of the of_node very > clear, but since it takes no reference on it, it will fail in subtle > ways if you pull the carpet from under its feet. > > For example, I expect of_mdio_find_bus() to fail. That is used only > rarely, like by the MDIO mux driver which I suppose you haven't tested. No, I didn't test it. In fact, most embedded devices will not use dynamic OF and all those of_node_get/put will be a noop. > If you want, you could make the OF MDIO API get() and put() the reference, > instead of using something it doesn't fully own. But currently the code > doesn't do that. Try to acknowledge what exists, first. What I saw in other drivers outside drivers/net is that one that allocates the dev will get the node before assigning dev.of_node and put it before freeing the device. The mdiobus case seems to be different. I believe it would make the code more robust if we could fix that inside OF MDIO API and not just document its behavior. It will also not break existing uses as extra get/put's are OK. I believe we could add an unregister callback to mii_bus. It wouldn't be too complex: >From b5b059ea4491e9f745872220fb94d8105e2d7d43 Mon Sep 17 00:00:00 2001 From: Luiz Angelo Daros de Luca <luizluca@gmail.com> Date: Tue, 12 Dec 2023 00:26:06 -0300 Subject: [PATCH] net: mdio: get/put device node during (un)registration __of_mdiobus_register() was storing the device node in dev.of_node without increasing its refcount. It was implicitly delegating to the caller to maintain the node allocated until mdiobus was unregistered. Now, the __of_mdiobus_register() will get the node before assigning it, and of_mdiobus_unregister_callback() will be called at the end of mdio_unregister(). Drivers can now put the node just after the MDIO registration. Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> --- drivers/net/mdio/of_mdio.c | 12 +++++++++++- drivers/net/phy/mdio_bus.c | 3 +++ include/linux/phy.h | 3 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/net/mdio/of_mdio.c b/drivers/net/mdio/of_mdio.c index 64ebcb6d235c..9b6cab6154e0 100644 --- a/drivers/net/mdio/of_mdio.c +++ b/drivers/net/mdio/of_mdio.c @@ -139,6 +139,11 @@ bool of_mdiobus_child_is_phy(struct device_node *child) } EXPORT_SYMBOL(of_mdiobus_child_is_phy); +static void __of_mdiobus_unregister_callback(struct mii_bus *mdio) +{ + of_node_put(mdio->dev.of_node); +} + /** * __of_mdiobus_register - Register mii_bus and create PHYs from the device tree * @mdio: pointer to mii_bus structure @@ -166,6 +171,8 @@ int __of_mdiobus_register(struct mii_bus *mdio, struct device_node *np, * the device tree are populated after the bus has been registered */ mdio->phy_mask = ~0; + mdio->__unregister_callback = __of_mdiobus_unregister_callback; + of_node_get(np); device_set_node(&mdio->dev, of_fwnode_handle(np)); /* Get bus level PHY reset GPIO details */ @@ -177,7 +184,7 @@ int __of_mdiobus_register(struct mii_bus *mdio, struct device_node *np, /* Register the MDIO bus */ rc = __mdiobus_register(mdio, owner); if (rc) - return rc; + goto put_node; /* Loop over the child nodes and register a phy_device for each phy */ for_each_available_child_of_node(np, child) { @@ -237,6 +244,9 @@ int __of_mdiobus_register(struct mii_bus *mdio, struct device_node *np, unregister: of_node_put(child); mdiobus_unregister(mdio); + +put_node: + of_node_put(np); return rc; } EXPORT_SYMBOL(__of_mdiobus_register); diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c index 25dcaa49ab8b..1229b8e4c53b 100644 --- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -787,6 +787,9 @@ void mdiobus_unregister(struct mii_bus *bus) gpiod_set_value_cansleep(bus->reset_gpiod, 1); device_del(&bus->dev); + + if (bus->__unregister_callback) + bus->__unregister_callback(bus); } EXPORT_SYMBOL(mdiobus_unregister); diff --git a/include/linux/phy.h b/include/linux/phy.h index e5f1f41e399c..2b383da4d825 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -433,6 +433,9 @@ struct mii_bus { /** @shared: shared state across different PHYs */ struct phy_package_shared *shared[PHY_MAX_ADDR]; + + /** @__unregister_callback: called at the last step of unregistration */ + void (*__unregister_callback)(struct mii_bus *bus); }; #define to_mii_bus(d) container_of(d, struct mii_bus, dev) -- 2.43.0 If we don't fix that in OF MDIO API, we would need to fix fe7324b932222 as well, moving the put to the dsa_switch_teardown(). Regards, Luiz
On Tue, Dec 12, 2023 at 12:47:57AM -0300, Luiz Angelo Daros de Luca wrote: > The unregistration happens only in mdiobus_unregister(), where, I > guess, it should avoid OF-specific code. Even if we put OF code there, > we would need to know during mdiobus_unregister() if the > bus->dev.of_node was gotten by of_mdio or someone else. > > I believe it is not nice to externally assign dev.of_node directly to > mdiobus but realtek switch driver is doing just that and others might > be doing the same thing. Well, make up your mind: earlier you said the user_mii_bus->dev.of_node assignment from the Realtek DSA driver is redundant, because devm_of_mdiobus_register() -> ... -> __of_mdiobus_register() does it anyway. So if it's redundant, you can remove it and nothing changes. What's so "not nice" about it that's worth complaining? Are you trying to say that you're concerned that some drivers might be populating the mii_bus->dev.of_node manually, and then proceeding to call the _non-OF_ mdiobus_register() variant? Some drivers like bcm_sf2.c? :) That will be a problem, yes. If a clean result is the goal, I guess some consolidation needs to be done before any new rule could be added. Otherwise, yeah, we can just snap on one more lazy layer of complexity, no problem. My 2 cents. > The delegation of of_node_get/put to the caller seems to be just an > easy workaround the fact that there is no good place to put a node > that of_mdio would get. For devm functions, we could include the > get/put call creating a new devm_of_mdiobus_unregister() but I believe > devm and non-devm needs to be equivalent (except for the resource > deallocation). How did we get here, who suggested to get and put the references to the OF node outside of the OF MDIO API? > > If you want, you could make the OF MDIO API get() and put() the reference, > > instead of using something it doesn't fully own. But currently the code > > doesn't do that. Try to acknowledge what exists, first. > > What I saw in other drivers outside drivers/net is that one that > allocates the dev will get the node before assigning dev.of_node and > put it before freeing the device. The mdiobus case seems to be > different. I believe it would make the code more robust if we could > fix that inside OF MDIO API and not just document its behavior. It > will also not break existing uses as extra get/put's are OK. > > I believe we could add an unregister callback to mii_bus. It wouldn't > be too complex: > > From b5b059ea4491e9f745872220fb94d8105e2d7d43 Mon Sep 17 00:00:00 2001 > From: Luiz Angelo Daros de Luca <luizluca@gmail.com> > Date: Tue, 12 Dec 2023 00:26:06 -0300 > Subject: [PATCH] net: mdio: get/put device node during (un)registration > > __of_mdiobus_register() was storing the device node in dev.of_node > without increasing its refcount. It was implicitly delegating to the > caller to maintain the node allocated until mdiobus was unregistered. > > Now, the __of_mdiobus_register() will get the node before assigning it, > and of_mdiobus_unregister_callback() will be called at the end of > mdio_unregister(). > > Drivers can now put the node just after the MDIO registration. > > Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> > --- > drivers/net/mdio/of_mdio.c | 12 +++++++++++- > drivers/net/phy/mdio_bus.c | 3 +++ > include/linux/phy.h | 3 +++ > 3 files changed, 17 insertions(+), 1 deletion(-) I don't mean to be rude, but I don't have the time to dig into this any further, sorry. If you are truly committed to better the phylib API, please bring it up with the phylib people instead. I literally only care about the thing that Alvin pointed out, which is that you made unjustified changes to a DSA driver. > If we don't fix that in OF MDIO API, we would need to fix > fe7324b932222 as well, moving the put to the dsa_switch_teardown(). Oh, couldn't we straight-up revert that instead? :) The user_mii_bus is created by DSA for compatibility with non-OF. I cannot understand why you insist to attach an OF node to it. But otherwise, yes, it is the same situation: of_node_put(), called before unregistering an MDIO bus registered with of_mdiobus_register(), means that the full OF API on this MDIO bus may not work correctly. I don't know the exact conditions though. It might be marginal or even a bug that's impossible to trigger. I haven't tested anything. In any case, while I encourage you to make OF node refcounting work in the way that you think is intuitive, I want to be clear about one thing, and that is that I'm not onboard with modifying phylib to make a non use-case in DSA work, aka OF-aware user_mii_bus (an oxymoron). I understand why a driver may want a ds->user_mii_bus. And I understand why a driver may want an MDIO bus with an of_node. What I don't understand is who might want both at the same time, and why.
> > The unregistration happens only in mdiobus_unregister(), where, I > > guess, it should avoid OF-specific code. Even if we put OF code there, > > we would need to know during mdiobus_unregister() if the > > bus->dev.of_node was gotten by of_mdio or someone else. > > > > I believe it is not nice to externally assign dev.of_node directly to > > mdiobus but realtek switch driver is doing just that and others might > > be doing the same thing. > > Well, make up your mind: earlier you said the user_mii_bus->dev.of_node > assignment from the Realtek DSA driver is redundant, because > devm_of_mdiobus_register() -> ... -> __of_mdiobus_register() does it > anyway. So if it's redundant, you can remove it and nothing changes. > What's so "not nice" about it that's worth complaining? > > Are you trying to say that you're concerned that some drivers might be > populating the mii_bus->dev.of_node manually, and then proceeding to > call the _non-OF_ mdiobus_register() variant? Yes. Just like that. :) > Some drivers like bcm_sf2.c? :) Yeah. > That will be a problem, yes. If a clean result is the goal, I guess some > consolidation needs to be done before any new rule could be added. > Otherwise, yeah, we can just snap on one more lazy layer of complexity, > no problem. My 2 cents. > > > The delegation of of_node_get/put to the caller seems to be just an > > easy workaround the fact that there is no good place to put a node > > that of_mdio would get. For devm functions, we could include the > > get/put call creating a new devm_of_mdiobus_unregister() but I believe > > devm and non-devm needs to be equivalent (except for the resource > > deallocation). > > How did we get here, who suggested to get and put the references to the > OF node outside of the OF MDIO API? It is not a suggestion. If it was a suggestion (like in a comment), it would be a little bit better. Some got it right and some didn't. The OF API will only return you a node with the refcount incremented. You need to put it in somewhere after that. That will happen no matter how you use the node and that's OK. The problem is when I pass that reference to another function, I need to somehow know if it keeps a reference to that node and not increments the refconf. If it does not keep the reference, it is OK. If it keeps the reference and gets it, it is also OK. The answer "just read (all the multiple level and different) code (paths)" is fated to fail. The put after registration in DSA core code is just an example of how it did not work. > > > If you want, you could make the OF MDIO API get() and put() the reference, > > > instead of using something it doesn't fully own. But currently the code > > > doesn't do that. Try to acknowledge what exists, first. > > > > What I saw in other drivers outside drivers/net is that one that > > allocates the dev will get the node before assigning dev.of_node and > > put it before freeing the device. The mdiobus case seems to be > > different. I believe it would make the code more robust if we could > > fix that inside OF MDIO API and not just document its behavior. It > > will also not break existing uses as extra get/put's are OK. > > > > I believe we could add an unregister callback to mii_bus. It wouldn't > > be too complex: > > > > From b5b059ea4491e9f745872220fb94d8105e2d7d43 Mon Sep 17 00:00:00 2001 > > From: Luiz Angelo Daros de Luca <luizluca@gmail.com> > > Date: Tue, 12 Dec 2023 00:26:06 -0300 > > Subject: [PATCH] net: mdio: get/put device node during (un)registration > > > > __of_mdiobus_register() was storing the device node in dev.of_node > > without increasing its refcount. It was implicitly delegating to the > > caller to maintain the node allocated until mdiobus was unregistered. > > > > Now, the __of_mdiobus_register() will get the node before assigning it, > > and of_mdiobus_unregister_callback() will be called at the end of > > mdio_unregister(). > > > > Drivers can now put the node just after the MDIO registration. > > > > Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> > > --- > > drivers/net/mdio/of_mdio.c | 12 +++++++++++- > > drivers/net/phy/mdio_bus.c | 3 +++ > > include/linux/phy.h | 3 +++ > > 3 files changed, 17 insertions(+), 1 deletion(-) > > I don't mean to be rude, but I don't have the time to dig into this any > further, sorry. If you are truly committed to better the phylib API, > please bring it up with the phylib people instead. I literally only > care about the thing that Alvin pointed out, which is that you made > unjustified changes to a DSA driver. Sure, phylib is still for netdev, right? I'll redo this patch to avoid putting the node before unregistration. > > If we don't fix that in OF MDIO API, we would need to fix > > fe7324b932222 as well, moving the put to the dsa_switch_teardown(). > > Oh, couldn't we straight-up revert that instead? :) The user_mii_bus > is created by DSA for compatibility with non-OF. I cannot understand > why you insist to attach an OF node to it. Please, not before this patch series gets merged or you'll break MDIO-connected Realtek DSA switches, at least the IRQ handling. I'll send the revert myself afterwards. > But otherwise, yes, it is the same situation: of_node_put(), called > before unregistering an MDIO bus registered with of_mdiobus_register(), > means that the full OF API on this MDIO bus may not work correctly. > I don't know the exact conditions though. It might be marginal or even > a bug that's impossible to trigger. I haven't tested anything. OK. I'll not try to fix that but revert it as soon as possible without breaking existing code. You need too much conditions to make it trigger a bug: 1) use dynamic OF 2) no other code also keep a reference to that node 3) a call that actually reads the of_node from user_mii.dev But, as you pointed out, OF and user_mii should not work together. > In any case, while I encourage you to make OF node refcounting work in > the way that you think is intuitive, I want to be clear about one thing, > and that is that I'm not onboard with modifying phylib to make a non > use-case in DSA work, aka OF-aware user_mii_bus (an oxymoron). The change to MDIO code would not be a requirement for this patch series. I'll deal with each front independently. > I understand why a driver may want a ds->user_mii_bus. And I understand > why a driver may want an MDIO bus with an of_node. What I don't understand > is who might want both at the same time, and why. That one I might be novice enough to answer :). When you start to write a new driver, you read the docs to get a general idea. However, as code moves faster than docs, you mainly rely on code. So, you just choose a driver (or a couple of them) to inspire you. You normally prefer a small driver because it is less code to read and it might be just enough to get started. As long as it is mainline, nothing indicates it should not be used as a reference. I wasn't the one that wrote most of the Realtek DSA driver but I see the same OF + user_mii_bus pattern in more than one driver. If you want to stop spreading, as rewriting all affected drivers might not be an option, a nice /* TODO: convert to user YXZ */ comment might do the trick. An updated doc suggesting a driver to be used as an example would also be nice. The doc update you sent about the "user MDIO bus documentation" telling us we should not use user_mii_bus when we describe the internal MDIO in the device-tree made me more confused. But I'll comment on that thread. Regards, Luiz
On Wed, Dec 13, 2023 at 01:37:27AM -0300, Luiz Angelo Daros de Luca wrote: > The answer "just read (all the multiple level and different) code > (paths)" is fated to fail. The put after registration in DSA core code > is just an example of how it did not work. If you try to think critically about the changes you make, you'll get better at it with time. Reading code is what we all do, really, and making educated guesses about what we saw. It's error prone for everyone involved, which is why we use review to confront what we all understand. This is not only an encouragement, but also a subtle hint that you will endlessly frustrate those who have to read more than you did, in order to review your proposals, only for you to complain that you have to read too much when making changes. Anyway. > > I don't mean to be rude, but I don't have the time to dig into this any > > further, sorry. If you are truly committed to better the phylib API, > > please bring it up with the phylib people instead. I literally only > > care about the thing that Alvin pointed out, which is that you made > > unjustified changes to a DSA driver. > > Sure, phylib is still for netdev, right? ETHERNET PHY LIBRARY M: Andrew Lunn <andrew@lunn.ch> M: Heiner Kallweit <hkallweit1@gmail.com> R: Russell King <linux@armlinux.org.uk> L: netdev@vger.kernel.org S: Maintained F: Documentation/ABI/testing/sysfs-class-net-phydev F: Documentation/devicetree/bindings/net/ethernet-phy.yaml F: Documentation/devicetree/bindings/net/mdio* F: Documentation/devicetree/bindings/net/qca,ar803x.yaml F: Documentation/networking/phy.rst F: drivers/net/mdio/ F: drivers/net/mdio/acpi_mdio.c F: drivers/net/mdio/fwnode_mdio.c F: drivers/net/mdio/of_mdio.c F: drivers/net/pcs/ F: drivers/net/phy/ F: include/dt-bindings/net/qca-ar803x.h F: include/linux/*mdio*.h F: include/linux/linkmode.h F: include/linux/mdio/*.h F: include/linux/mii.h F: include/linux/of_net.h F: include/linux/phy.h F: include/linux/phy_fixed.h F: include/linux/phylib_stubs.h F: include/linux/platform_data/mdio-bcm-unimac.h F: include/linux/platform_data/mdio-gpio.h F: include/trace/events/mdio.h F: include/uapi/linux/mdio.h F: include/uapi/linux/mii.h F: net/core/of_net.c > > Oh, couldn't we straight-up revert that instead? :) The user_mii_bus > > is created by DSA for compatibility with non-OF. I cannot understand > > why you insist to attach an OF node to it. > > Please, not before this patch series gets merged or you'll break > MDIO-connected Realtek DSA switches, at least the IRQ handling. > I'll send the revert myself afterwards. > > > But otherwise, yes, it is the same situation: of_node_put(), called > > before unregistering an MDIO bus registered with of_mdiobus_register(), > > means that the full OF API on this MDIO bus may not work correctly. > > I don't know the exact conditions though. It might be marginal or even > > a bug that's impossible to trigger. I haven't tested anything. > > OK. I'll not try to fix that but revert it as soon as possible without > breaking existing code. Ok, I'm not saying "revert it NOW". But it would be good if you could revert it as part of the realtek-common series, as a last patch (provided that you've done your homework and nobody else relies on it). Or at least not forget about it. > > I understand why a driver may want a ds->user_mii_bus. And I understand > > why a driver may want an MDIO bus with an of_node. What I don't understand > > is who might want both at the same time, and why. > > That one I might be novice enough to answer :). > > When you start to write a new driver, you read the docs to get a > general idea. However, as code moves faster than docs, you mainly rely > on code. So, you just choose a driver (or a couple of them) to inspire > you. You normally prefer a small driver because it is less code to > read and it might be just enough to get started. As long as it is > mainline, nothing indicates it should not be used as a reference. And when you consider that DSA has better documentation than most subsystems out there.... Would it blow your mind away if I told you that the documentation is written based on the code? The same code from which you draw a lazy conclusion. You have perfectly laid out why the code is not the problem, and why the documentation is not the solution. The problem is the unwillingness to spend time to understand, but to want to push your changes nonetheless. The problem is on your end. I'm sorry, it has to be said. > I wasn't the one that wrote most of the Realtek DSA driver but I see > the same OF + user_mii_bus pattern in more than one driver. If you > want to stop spreading, as rewriting all affected drivers might not be > an option, a nice /* TODO: convert to user YXZ */ comment might do the > trick. An updated doc suggesting a driver to be used as an example > would also be nice. I don't have time, Luiz. I spent 1 and a half hours today replying to just your emails, and one and a half hours yesterday. I have a job. You've made me see that I'm wasting my time writing documentation for people who want instant gratification instead. I don't know how to get to them. I'll think about it some more.
> On Wed, Dec 13, 2023 at 01:37:27AM -0300, Luiz Angelo Daros de Luca wrote: > > The answer "just read (all the multiple level and different) code > > (paths)" is fated to fail. The put after registration in DSA core code > > is just an example of how it did not work. > > If you try to think critically about the changes you make, you'll get > better at it with time. Reading code is what we all do, really, and > making educated guesses about what we saw. It's error prone for everyone > involved, which is why we use review to confront what we all understand. I'm not trying to blame anyone. We all make mistakes. However, when we have the same mistake happening in different situations, maybe we are only dealing with the consequences and not the cause. > This is not only an encouragement, but also a subtle hint that you will > endlessly frustrate those who have to read more than you did, in order > to review your proposals, only for you to complain that you have to read > too much when making changes. Sorry if that's what my words said. It really wasn't my intention. I'm not complaining about reading the code. It is actually a pleasure. But I'll bother the MDIO guys on that matter :) > Anyway. > > > > I don't mean to be rude, but I don't have the time to dig into this any > > > further, sorry. If you are truly committed to better the phylib API, > > > please bring it up with the phylib people instead. I literally only > > > care about the thing that Alvin pointed out, which is that you made > > > unjustified changes to a DSA driver. > > > > Sure, phylib is still for netdev, right? > > ETHERNET PHY LIBRARY > M: Andrew Lunn <andrew@lunn.ch> > M: Heiner Kallweit <hkallweit1@gmail.com> > R: Russell King <linux@armlinux.org.uk> > L: netdev@vger.kernel.org > S: Maintained > F: Documentation/ABI/testing/sysfs-class-net-phydev > F: Documentation/devicetree/bindings/net/ethernet-phy.yaml > F: Documentation/devicetree/bindings/net/mdio* > F: Documentation/devicetree/bindings/net/qca,ar803x.yaml > F: Documentation/networking/phy.rst > F: drivers/net/mdio/ > F: drivers/net/mdio/acpi_mdio.c > F: drivers/net/mdio/fwnode_mdio.c > F: drivers/net/mdio/of_mdio.c > F: drivers/net/pcs/ > F: drivers/net/phy/ > F: include/dt-bindings/net/qca-ar803x.h > F: include/linux/*mdio*.h > F: include/linux/linkmode.h > F: include/linux/mdio/*.h > F: include/linux/mii.h > F: include/linux/of_net.h > F: include/linux/phy.h > F: include/linux/phy_fixed.h > F: include/linux/phylib_stubs.h > F: include/linux/platform_data/mdio-bcm-unimac.h > F: include/linux/platform_data/mdio-gpio.h > F: include/trace/events/mdio.h > F: include/uapi/linux/mdio.h > F: include/uapi/linux/mii.h > F: net/core/of_net.c > > > > Oh, couldn't we straight-up revert that instead? :) The user_mii_bus > > > is created by DSA for compatibility with non-OF. I cannot understand > > > why you insist to attach an OF node to it. > > > > Please, not before this patch series gets merged or you'll break > > MDIO-connected Realtek DSA switches, at least the IRQ handling. > > I'll send the revert myself afterwards. > > > > > But otherwise, yes, it is the same situation: of_node_put(), called > > > before unregistering an MDIO bus registered with of_mdiobus_register(), > > > means that the full OF API on this MDIO bus may not work correctly. > > > I don't know the exact conditions though. It might be marginal or even > > > a bug that's impossible to trigger. I haven't tested anything. > > > > OK. I'll not try to fix that but revert it as soon as possible without > > breaking existing code. > > Ok, I'm not saying "revert it NOW". But it would be good if you could > revert it as part of the realtek-common series, as a last patch > (provided that you've done your homework and nobody else relies on it). > Or at least not forget about it. I'll add the revert patch. I believe I just need to pay attention to those cases where phy_read/phy_write are included in ds_ops. > > > I understand why a driver may want a ds->user_mii_bus. And I understand > > > why a driver may want an MDIO bus with an of_node. What I don't understand > > > is who might want both at the same time, and why. > > > > That one I might be novice enough to answer :). > > > > When you start to write a new driver, you read the docs to get a > > general idea. However, as code moves faster than docs, you mainly rely > > on code. So, you just choose a driver (or a couple of them) to inspire > > you. You normally prefer a small driver because it is less code to > > read and it might be just enough to get started. As long as it is > > mainline, nothing indicates it should not be used as a reference. > > And when you consider that DSA has better documentation than most > subsystems out there.... > > Would it blow your mind away if I told you that the documentation is > written based on the code? The same code from which you draw a lazy > conclusion. > > You have perfectly laid out why the code is not the problem, and why the > documentation is not the solution. The problem is the unwillingness to > spend time to understand, but to want to push your changes nonetheless. > The problem is on your end. I'm sorry, it has to be said. Oh, I did spend a lot of time on it, hundreds from my free and sleep time :). Sometimes you just cannot understand by yourself. And yes, I do believe in most cases I'm wrong. > > I wasn't the one that wrote most of the Realtek DSA driver but I see > > the same OF + user_mii_bus pattern in more than one driver. If you > > want to stop spreading, as rewriting all affected drivers might not be > > an option, a nice /* TODO: convert to user YXZ */ comment might do the > > trick. An updated doc suggesting a driver to be used as an example > > would also be nice. > > I don't have time, Luiz. I spent 1 and a half hours today replying > to just your emails, and one and a half hours yesterday. I have a job. Vladimir, I cannot thank you enough. I probably wouldn't get it (and maybe I still didn't) without your help. > You've made me see that I'm wasting my time writing documentation for > people who want instant gratification instead. I don't know how to get > to them. I'll think about it some more. I wouldn't say submitting patches to the kernel brings some kind of gratification (except if you are a type of masochist). I'm just trying to help. Probably you are spending more time answering my emails than you should. ML are great but their knowledge tends to fade away with time. The answer to some topics I brought should be there, not here. Save your time for reviewing the code. I normally work a lot more than I should just to avoid the chance of reworking on it. A comment in code like /* This behavior is deprecated. Please see https://kernel.org/doc/xxx */ would both avoid the code to be spread and alert maintainers that there is some work to be done. We all have nicer things to do instead but it could save some future headaches. My 2 cents. Regards, Luiz
diff --git a/drivers/net/dsa/realtek/realtek-smi.c b/drivers/net/dsa/realtek/realtek-smi.c index 755546ed8db6..ddcae546afbc 100644 --- a/drivers/net/dsa/realtek/realtek-smi.c +++ b/drivers/net/dsa/realtek/realtek-smi.c @@ -389,15 +389,15 @@ static int realtek_smi_setup_mdio(struct dsa_switch *ds) priv->user_mii_bus->write = realtek_smi_mdio_write; snprintf(priv->user_mii_bus->id, MII_BUS_ID_SIZE, "SMI-%d", ds->index); - priv->user_mii_bus->dev.of_node = mdio_np; priv->user_mii_bus->parent = priv->dev; ds->user_mii_bus = priv->user_mii_bus; ret = devm_of_mdiobus_register(priv->dev, priv->user_mii_bus, mdio_np); + of_node_put(mdio_np); if (ret) { dev_err(priv->dev, "unable to register MDIO bus %s\n", priv->user_mii_bus->id); - goto err_put_node; + return ret; } return 0; @@ -514,8 +514,6 @@ static void realtek_smi_remove(struct platform_device *pdev) return; dsa_unregister_switch(priv->ds); - if (priv->user_mii_bus) - of_node_put(priv->user_mii_bus->dev.of_node); /* leave the device reset asserted */ if (priv->reset)
If we don't keep a reference to the OF node, we can put it right after we use it during MDIO registration. Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> --- drivers/net/dsa/realtek/realtek-smi.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)