diff mbox series

[net-next,2/6] software node: allow named software node to be created

Message ID E1oCNky-006e3g-KA@rmk-PC.armlinux.org.uk (mailing list archive)
State New, archived
Headers show
Series net: dsa: always use phylink | expand

Commit Message

Russell King (Oracle) July 15, 2022, 4:01 p.m. UTC
From: Vladimir Oltean <vladimir.oltean@nxp.com>

Allow a named software node to be created, which is needed for software
nodes for a fixed-link specification for DSA.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/base/swnode.c    | 14 ++++++++++++--
 include/linux/property.h |  4 ++++
 2 files changed, 16 insertions(+), 2 deletions(-)

Comments

Andy Shevchenko July 15, 2022, 7:57 p.m. UTC | #1
On Fri, Jul 15, 2022 at 05:01:32PM +0100, Russell King wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> Allow a named software node to be created, which is needed for software
> nodes for a fixed-link specification for DSA.

In general I have no objection, but what's worrying me is a possibility to
collide in namespace. With the current code the name is generated based on
unique IDs, how can we make this one more robust?
Vladimir Oltean July 15, 2022, 8:17 p.m. UTC | #2
On Fri, Jul 15, 2022 at 10:57:55PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 15, 2022 at 05:01:32PM +0100, Russell King wrote:
> > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > 
> > Allow a named software node to be created, which is needed for software
> > nodes for a fixed-link specification for DSA.
> 
> In general I have no objection, but what's worrying me is a possibility to
> collide in namespace. With the current code the name is generated based on
> unique IDs, how can we make this one more robust?

Could you be more clear about the exact concern?
Andy Shevchenko July 15, 2022, 8:33 p.m. UTC | #3
On Fri, Jul 15, 2022 at 11:17:15PM +0300, Vladimir Oltean wrote:
> On Fri, Jul 15, 2022 at 10:57:55PM +0300, Andy Shevchenko wrote:
> > On Fri, Jul 15, 2022 at 05:01:32PM +0100, Russell King wrote:
> > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > 
> > > Allow a named software node to be created, which is needed for software
> > > nodes for a fixed-link specification for DSA.
> > 
> > In general I have no objection, but what's worrying me is a possibility to
> > collide in namespace. With the current code the name is generated based on
> > unique IDs, how can we make this one more robust?
> 
> Could you be more clear about the exact concern?

Each software node can be created with a name. The hierarchy should be unique,
means that there can't be two or more nodes with the same path (like on file
system or more specifically here, Device Tree). Allowing to pass names we may
end up with the situation when it will be a path collision. Yet, the static
names are easier to check, because one may run `git grep ...` or coccinelle
script to see what's in the kernel.
Vladimir Oltean July 15, 2022, 8:48 p.m. UTC | #4
On Fri, Jul 15, 2022 at 11:33:40PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 15, 2022 at 11:17:15PM +0300, Vladimir Oltean wrote:
> > On Fri, Jul 15, 2022 at 10:57:55PM +0300, Andy Shevchenko wrote:
> > > On Fri, Jul 15, 2022 at 05:01:32PM +0100, Russell King wrote:
> > > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > > 
> > > > Allow a named software node to be created, which is needed for software
> > > > nodes for a fixed-link specification for DSA.
> > > 
> > > In general I have no objection, but what's worrying me is a possibility to
> > > collide in namespace. With the current code the name is generated based on
> > > unique IDs, how can we make this one more robust?
> > 
> > Could you be more clear about the exact concern?
> 
> Each software node can be created with a name. The hierarchy should be unique,
> means that there can't be two or more nodes with the same path (like on file
> system or more specifically here, Device Tree). Allowing to pass names we may
> end up with the situation when it will be a path collision. Yet, the static
> names are easier to check, because one may run `git grep ...` or coccinelle
> script to see what's in the kernel.

So won't kobject_init_and_add() fail on namespace collision? Is it the
problem that it's going to fail, or that it's not trivial to statically
determine whether it'll fail?

Sorry, but I don't see something actionable about this.
Andy Shevchenko July 18, 2022, 12:29 p.m. UTC | #5
On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:
> On Fri, Jul 15, 2022 at 11:33:40PM +0300, Andy Shevchenko wrote:
> > On Fri, Jul 15, 2022 at 11:17:15PM +0300, Vladimir Oltean wrote:
> > > On Fri, Jul 15, 2022 at 10:57:55PM +0300, Andy Shevchenko wrote:
> > > > On Fri, Jul 15, 2022 at 05:01:32PM +0100, Russell King wrote:
> > > > > From: Vladimir Oltean <vladimir.oltean@nxp.com>
> > > > > 
> > > > > Allow a named software node to be created, which is needed for software
> > > > > nodes for a fixed-link specification for DSA.
> > > > 
> > > > In general I have no objection, but what's worrying me is a possibility to
> > > > collide in namespace. With the current code the name is generated based on
> > > > unique IDs, how can we make this one more robust?
> > > 
> > > Could you be more clear about the exact concern?
> > 
> > Each software node can be created with a name. The hierarchy should be unique,
> > means that there can't be two or more nodes with the same path (like on file
> > system or more specifically here, Device Tree). Allowing to pass names we may
> > end up with the situation when it will be a path collision. Yet, the static
> > names are easier to check, because one may run `git grep ...` or coccinelle
> > script to see what's in the kernel.
> 
> So won't kobject_init_and_add() fail on namespace collision? Is it the
> problem that it's going to fail, or that it's not trivial to statically
> determine whether it'll fail?
> 
> Sorry, but I don't see something actionable about this.

I'm talking about validation before a runtime. But if you think that is fine,
let's fail it at runtime, okay, and consume more backtraces in the future.
Russell King (Oracle) July 18, 2022, 1:27 p.m. UTC | #6
On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:
> > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > problem that it's going to fail, or that it's not trivial to statically
> > determine whether it'll fail?
> > 
> > Sorry, but I don't see something actionable about this.
> 
> I'm talking about validation before a runtime. But if you think that is fine,
> let's fail it at runtime, okay, and consume more backtraces in the future.

Is there any sane way to do validation of this namespace before
runtime?

The problem in this instance is we need a node named "fixed-link" that
is attached to the parent node as that is defined in the binding doc,
and we're creating swnodes to provide software generated nodes for
this binding.

There could be several such nodes scattered around, but in this
instance they are very short-lived before they are destroyed, they
don't even need to be published to userspace (and its probably a waste
of CPU cycles for them to be published there.)

So, for this specific case, is this the best approach, or is there
some better way to achieve what we need here?

Thanks.
Andy Shevchenko July 18, 2022, 6:43 p.m. UTC | #7
On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:
> On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:
> > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:
> > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > problem that it's going to fail, or that it's not trivial to statically
> > > determine whether it'll fail?
> > > 
> > > Sorry, but I don't see something actionable about this.
> > 
> > I'm talking about validation before a runtime. But if you think that is fine,
> > let's fail it at runtime, okay, and consume more backtraces in the future.
> 
> Is there any sane way to do validation of this namespace before
> runtime?

For statically compiled, I think we can do it (to some extent).
Currently only three drivers, if I'm not mistaken, define software nodes with
names. It's easy to check that their node names are unique.

When you allow such an API then we might have tracebacks (from sysfs) bout name
collisions. Not that is something new to kernel (we have seen many of a kind),
but I prefer, if possible, to validate this before sysfs issues a traceback.

> The problem in this instance is we need a node named "fixed-link" that
> is attached to the parent node as that is defined in the binding doc,
> and we're creating swnodes to provide software generated nodes for
> this binding.

And how you guarantee that it will be only a single one with unique pathname?

For example, you have two DSA cards (or whatever it's called) in the SMP system,
it mean that there is non-zero probability of coexisting swnodes for them.

> There could be several such nodes scattered around, but in this
> instance they are very short-lived before they are destroyed, they
> don't even need to be published to userspace (and its probably a waste
> of CPU cycles for them to be published there.)
> 
> So, for this specific case, is this the best approach, or is there
> some better way to achieve what we need here?

Honestly, I don't know.

The "workaround" (but it looks to me rather a hack) is to create unique swnode
and make fixed-link as a child of it.

Or entire concept of the root swnodes (when name is provided) should be
reconsidered, so somehow we will have a uniqueness so that the entire
path(s) behind it will be caller-dependent. But this I also don't like.

Maybe Heikki, Sakari, Rafael can share their thoughts...

Just for my learning, why PHY uses "fixed-link" instead of relying on a
(firmware) graph? It might be the actual solution to your problem.

How graphs are used with swnodes, you may look into IPU3 (Intel Camera)
glue driver to support devices before MIPI standardisation of the
respective properties.
Andy Shevchenko July 18, 2022, 6:53 p.m. UTC | #8
On Mon, Jul 18, 2022 at 09:43:42PM +0300, Andy Shevchenko wrote:
> On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:
> > On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:
> > > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:
> > > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > > problem that it's going to fail, or that it's not trivial to statically
> > > > determine whether it'll fail?
> > > > 
> > > > Sorry, but I don't see something actionable about this.
> > > 
> > > I'm talking about validation before a runtime. But if you think that is fine,
> > > let's fail it at runtime, okay, and consume more backtraces in the future.
> > 
> > Is there any sane way to do validation of this namespace before
> > runtime?
> 
> For statically compiled, I think we can do it (to some extent).
> Currently only three drivers, if I'm not mistaken, define software nodes with
> names. It's easy to check that their node names are unique.
> 
> When you allow such an API then we might have tracebacks (from sysfs) bout name
> collisions. Not that is something new to kernel (we have seen many of a kind),
> but I prefer, if possible, to validate this before sysfs issues a traceback.
> 
> > The problem in this instance is we need a node named "fixed-link" that
> > is attached to the parent node as that is defined in the binding doc,
> > and we're creating swnodes to provide software generated nodes for
> > this binding.
> 
> And how you guarantee that it will be only a single one with unique pathname?
> 
> For example, you have two DSA cards (or whatever it's called) in the SMP system,
> it mean that there is non-zero probability of coexisting swnodes for them.
> 
> > There could be several such nodes scattered around, but in this
> > instance they are very short-lived before they are destroyed, they
> > don't even need to be published to userspace (and its probably a waste
> > of CPU cycles for them to be published there.)
> > 
> > So, for this specific case, is this the best approach, or is there
> > some better way to achieve what we need here?
> 
> Honestly, I don't know.
> 
> The "workaround" (but it looks to me rather a hack) is to create unique swnode
> and make fixed-link as a child of it.
> 
> Or entire concept of the root swnodes (when name is provided) should be
> reconsidered, so somehow we will have a uniqueness so that the entire
> path(s) behind it will be caller-dependent. But this I also don't like.
> 
> Maybe Heikki, Sakari, Rafael can share their thoughts...
> 
> Just for my learning, why PHY uses "fixed-link" instead of relying on a
> (firmware) graph? It might be the actual solution to your problem.
> 
> How graphs are used with swnodes, you may look into IPU3 (Intel Camera)
> glue driver to support devices before MIPI standardisation of the
> respective properties.

Forgot to say (yes, it maybe obvious) that this API will be exported,
anyone can use it and trap into the similar issue, because, for example,
of testing in environment with a single instance of the caller.
Russell King (Oracle) July 18, 2022, 7:11 p.m. UTC | #9
On Mon, Jul 18, 2022 at 09:43:41PM +0300, Andy Shevchenko wrote:
> On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:
> > On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:
> > > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:
> > > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > > problem that it's going to fail, or that it's not trivial to statically
> > > > determine whether it'll fail?
> > > > 
> > > > Sorry, but I don't see something actionable about this.
> > > 
> > > I'm talking about validation before a runtime. But if you think that is fine,
> > > let's fail it at runtime, okay, and consume more backtraces in the future.
> > 
> > Is there any sane way to do validation of this namespace before
> > runtime?
> 
> For statically compiled, I think we can do it (to some extent).
> Currently only three drivers, if I'm not mistaken, define software nodes with
> names. It's easy to check that their node names are unique.
> 
> When you allow such an API then we might have tracebacks (from sysfs) bout name
> collisions. Not that is something new to kernel (we have seen many of a kind),
> but I prefer, if possible, to validate this before sysfs issues a traceback.
> 
> > The problem in this instance is we need a node named "fixed-link" that
> > is attached to the parent node as that is defined in the binding doc,
> > and we're creating swnodes to provide software generated nodes for
> > this binding.
> 
> And how you guarantee that it will be only a single one with unique pathname?
> 
> For example, you have two DSA cards (or whatever it's called) in the SMP system,
> it mean that there is non-zero probability of coexisting swnodes for them.

Good point - I guess we at least need to attach the swnode parent to the
device so its path is unique, because right now that isn't the case. I'm
guessing that:

        new_port_fwnode = fwnode_create_software_node(port_props, NULL);

will create something at the root of the swnode tree, and then:

        fixed_link_fwnode = fwnode_create_named_software_node(fixed_link_props,
                                                              new_port_fwnode,
                                                              "fixed-link");

will create a node with a fixed name. I guess it in part depends what
pathname the first node gets (which we don't specify.) I'm not familiar
with the swnode code to know what happens with the naming for the first
node.

However, it seems sensible to me to attach the first node to the device
node, thus giving it a unique fwnode path. Does that solve the problem
in swnode land?

> > There could be several such nodes scattered around, but in this
> > instance they are very short-lived before they are destroyed, they
> > don't even need to be published to userspace (and its probably a waste
> > of CPU cycles for them to be published there.)
> > 
> > So, for this specific case, is this the best approach, or is there
> > some better way to achieve what we need here?
> 
> Honestly, I don't know.
> 
> The "workaround" (but it looks to me rather a hack) is to create unique swnode
> and make fixed-link as a child of it.
> 
> Or entire concept of the root swnodes (when name is provided) should be
> reconsidered, so somehow we will have a uniqueness so that the entire
> path(s) behind it will be caller-dependent. But this I also don't like.
> 
> Maybe Heikki, Sakari, Rafael can share their thoughts...
> 
> Just for my learning, why PHY uses "fixed-link" instead of relying on a
> (firmware) graph? It might be the actual solution to your problem.

That's a question for Andrew, but I've tried to solicit his comments on
several occasions concerning this "feature" of DSA but I keep getting
no reply. Honestly, I don't know the answer to your question.

The only thing that I know is that Andrew has been promoting this
feature where a switch port, whether it be connected to the CPU or
to another switch, which doesn't specify any link parameters will
automatically use the fastest "phy interface mode" and the fastest
link speed that can be supported by the DSA device.

This has caused issues over the last few years which we've bodged
around in various ways, and with updates to one of the DSA drivers
this bodging is becoming more of a wart that's spreading. So, I'm
trying to find a way to solve this.

My initial approach was to avoid fiddling with the firmware tree,
but Vladimir proposed this approach as being cleaner - and it means
the "bodge" becomes completely localised in the DSA (distributed
switch architecture) code rather than being spread into phylink.

I wish we could get rid of this "feature" but since it's been
established for many years, and we have at least one known driver
that uses it, getting rid of it breaks existing firmware trees.
I think we also have one other driver that makes use of it as
well, but I can't say for certain (because it's not really possible
to discern which drivers use this feature from reading the driver
code.) I've tried asking Andrew if he knows and got no response.

So I'm in a complete information vacuum here - all that I know is
that trying to convert the mv88e6xxx DSA driver to use phylink_pcs
will break it (as reported by Marek Behún), because phylink doesn't
get used if firmware is using this "defaulting" feature.

It's part of the DT binding, and remains so today - the properties
specifying the "phy-mode", "fixed-link" etc all remain optional.
Russell King (Oracle) July 18, 2022, 7:14 p.m. UTC | #10
On Mon, Jul 18, 2022 at 09:53:39PM +0300, Andy Shevchenko wrote:
> On Mon, Jul 18, 2022 at 09:43:42PM +0300, Andy Shevchenko wrote:
> > On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:
> > > On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:
> > > > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:
> > > > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > > > problem that it's going to fail, or that it's not trivial to statically
> > > > > determine whether it'll fail?
> > > > > 
> > > > > Sorry, but I don't see something actionable about this.
> > > > 
> > > > I'm talking about validation before a runtime. But if you think that is fine,
> > > > let's fail it at runtime, okay, and consume more backtraces in the future.
> > > 
> > > Is there any sane way to do validation of this namespace before
> > > runtime?
> > 
> > For statically compiled, I think we can do it (to some extent).
> > Currently only three drivers, if I'm not mistaken, define software nodes with
> > names. It's easy to check that their node names are unique.
> > 
> > When you allow such an API then we might have tracebacks (from sysfs) bout name
> > collisions. Not that is something new to kernel (we have seen many of a kind),
> > but I prefer, if possible, to validate this before sysfs issues a traceback.
> > 
> > > The problem in this instance is we need a node named "fixed-link" that
> > > is attached to the parent node as that is defined in the binding doc,
> > > and we're creating swnodes to provide software generated nodes for
> > > this binding.
> > 
> > And how you guarantee that it will be only a single one with unique pathname?
> > 
> > For example, you have two DSA cards (or whatever it's called) in the SMP system,
> > it mean that there is non-zero probability of coexisting swnodes for them.
> > 
> > > There could be several such nodes scattered around, but in this
> > > instance they are very short-lived before they are destroyed, they
> > > don't even need to be published to userspace (and its probably a waste
> > > of CPU cycles for them to be published there.)
> > > 
> > > So, for this specific case, is this the best approach, or is there
> > > some better way to achieve what we need here?
> > 
> > Honestly, I don't know.
> > 
> > The "workaround" (but it looks to me rather a hack) is to create unique swnode
> > and make fixed-link as a child of it.
> > 
> > Or entire concept of the root swnodes (when name is provided) should be
> > reconsidered, so somehow we will have a uniqueness so that the entire
> > path(s) behind it will be caller-dependent. But this I also don't like.
> > 
> > Maybe Heikki, Sakari, Rafael can share their thoughts...
> > 
> > Just for my learning, why PHY uses "fixed-link" instead of relying on a
> > (firmware) graph? It might be the actual solution to your problem.
> > 
> > How graphs are used with swnodes, you may look into IPU3 (Intel Camera)
> > glue driver to support devices before MIPI standardisation of the
> > respective properties.
> 
> Forgot to say (yes, it maybe obvious) that this API will be exported,
> anyone can use it and trap into the similar issue, because, for example,
> of testing in environment with a single instance of the caller.

I think we're coming to the conclusion that using swnodes is not the
correct approach for this problem, correct?
Andy Shevchenko July 18, 2022, 7:24 p.m. UTC | #11
On Mon, Jul 18, 2022 at 08:14:58PM +0100, Russell King (Oracle) wrote:
> On Mon, Jul 18, 2022 at 09:53:39PM +0300, Andy Shevchenko wrote:
> > On Mon, Jul 18, 2022 at 09:43:42PM +0300, Andy Shevchenko wrote:
> > > On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:
> > > > On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:
> > > > > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:
> > > > > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > > > > problem that it's going to fail, or that it's not trivial to statically
> > > > > > determine whether it'll fail?
> > > > > > 
> > > > > > Sorry, but I don't see something actionable about this.
> > > > > 
> > > > > I'm talking about validation before a runtime. But if you think that is fine,
> > > > > let's fail it at runtime, okay, and consume more backtraces in the future.
> > > > 
> > > > Is there any sane way to do validation of this namespace before
> > > > runtime?
> > > 
> > > For statically compiled, I think we can do it (to some extent).
> > > Currently only three drivers, if I'm not mistaken, define software nodes with
> > > names. It's easy to check that their node names are unique.
> > > 
> > > When you allow such an API then we might have tracebacks (from sysfs) bout name
> > > collisions. Not that is something new to kernel (we have seen many of a kind),
> > > but I prefer, if possible, to validate this before sysfs issues a traceback.
> > > 
> > > > The problem in this instance is we need a node named "fixed-link" that
> > > > is attached to the parent node as that is defined in the binding doc,
> > > > and we're creating swnodes to provide software generated nodes for
> > > > this binding.
> > > 
> > > And how you guarantee that it will be only a single one with unique pathname?
> > > 
> > > For example, you have two DSA cards (or whatever it's called) in the SMP system,
> > > it mean that there is non-zero probability of coexisting swnodes for them.
> > > 
> > > > There could be several such nodes scattered around, but in this
> > > > instance they are very short-lived before they are destroyed, they
> > > > don't even need to be published to userspace (and its probably a waste
> > > > of CPU cycles for them to be published there.)
> > > > 
> > > > So, for this specific case, is this the best approach, or is there
> > > > some better way to achieve what we need here?
> > > 
> > > Honestly, I don't know.
> > > 
> > > The "workaround" (but it looks to me rather a hack) is to create unique swnode
> > > and make fixed-link as a child of it.
> > > 
> > > Or entire concept of the root swnodes (when name is provided) should be
> > > reconsidered, so somehow we will have a uniqueness so that the entire
> > > path(s) behind it will be caller-dependent. But this I also don't like.
> > > 
> > > Maybe Heikki, Sakari, Rafael can share their thoughts...
> > > 
> > > Just for my learning, why PHY uses "fixed-link" instead of relying on a
> > > (firmware) graph? It might be the actual solution to your problem.
> > > 
> > > How graphs are used with swnodes, you may look into IPU3 (Intel Camera)
> > > glue driver to support devices before MIPI standardisation of the
> > > respective properties.
> > 
> > Forgot to say (yes, it maybe obvious) that this API will be exported,
> > anyone can use it and trap into the similar issue, because, for example,
> > of testing in environment with a single instance of the caller.
> 
> I think we're coming to the conclusion that using swnodes is not the
> correct approach for this problem, correct?

If I understand the possibilities of the usage in _this_ case, then it's
would be problematic (it does not mean it's incorrect). It might be due to
swnode design restrictions which shouldn't be made, I dunno. That' why
it's better to ask the others for their opinions.

By design swnode's name makes not much sense, because the payload there
is a property set, where _name_ is a must.

Now, telling you this, I'm questioning myself why the heck I added names
to swnodes in the intel_quark_i2c_gpio driver...
Andy Shevchenko July 18, 2022, 8:07 p.m. UTC | #12
On Mon, Jul 18, 2022 at 08:11:40PM +0100, Russell King (Oracle) wrote:
> On Mon, Jul 18, 2022 at 09:43:41PM +0300, Andy Shevchenko wrote:
> > On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:
> > > On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:
> > > > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:
> > > > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > > > problem that it's going to fail, or that it's not trivial to statically
> > > > > determine whether it'll fail?
> > > > > 
> > > > > Sorry, but I don't see something actionable about this.
> > > > 
> > > > I'm talking about validation before a runtime. But if you think that is fine,
> > > > let's fail it at runtime, okay, and consume more backtraces in the future.
> > > 
> > > Is there any sane way to do validation of this namespace before
> > > runtime?
> > 
> > For statically compiled, I think we can do it (to some extent).
> > Currently only three drivers, if I'm not mistaken, define software nodes with
> > names. It's easy to check that their node names are unique.
> > 
> > When you allow such an API then we might have tracebacks (from sysfs) bout name
> > collisions. Not that is something new to kernel (we have seen many of a kind),
> > but I prefer, if possible, to validate this before sysfs issues a traceback.
> > 
> > > The problem in this instance is we need a node named "fixed-link" that
> > > is attached to the parent node as that is defined in the binding doc,
> > > and we're creating swnodes to provide software generated nodes for
> > > this binding.
> > 
> > And how you guarantee that it will be only a single one with unique pathname?
> > 
> > For example, you have two DSA cards (or whatever it's called) in the SMP system,
> > it mean that there is non-zero probability of coexisting swnodes for them.
> 
> Good point - I guess we at least need to attach the swnode parent to the
> device so its path is unique, because right now that isn't the case. I'm
> guessing that:
> 
>         new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> 
> will create something at the root of the swnode tree, and then:
> 
>         fixed_link_fwnode = fwnode_create_named_software_node(fixed_link_props,
>                                                               new_port_fwnode,
>                                                               "fixed-link");
> 
> will create a node with a fixed name. I guess it in part depends what
> pathname the first node gets (which we don't specify.) I'm not familiar
> with the swnode code to know what happens with the naming for the first
> node.

First node's name will be unique which is guaranteed by IDA framework. If we
have already 2B nodes, then yes, it would be problematic (but 2^31 ought to be
enough :-).

> However, it seems sensible to me to attach the first node to the device
> node, thus giving it a unique fwnode path. Does that solve the problem
> in swnode land?

Yes, but in the driver you will have that as child of the device, analogue in DT

  my_root_node { // equal the level of device node you attach it to
	  fixed-link {
	  }
  }

(Sorry, I don't know the DT syntax by heart, but I hope you got the idea.)

To access it will be something like

  child = fwnode_get_named_child_node(fwnode, "fixed-link");

And reading properties, if needed,

  ret = fnode_property_read_...(child, ...);


But this might require to adopt drivers, no? Or I misunderstand the hierarchy.

> > > There could be several such nodes scattered around, but in this
> > > instance they are very short-lived before they are destroyed, they
> > > don't even need to be published to userspace (and its probably a waste
> > > of CPU cycles for them to be published there.)
> > > 
> > > So, for this specific case, is this the best approach, or is there
> > > some better way to achieve what we need here?
> > 
> > Honestly, I don't know.
> > 
> > The "workaround" (but it looks to me rather a hack) is to create unique swnode
> > and make fixed-link as a child of it.
> > 
> > Or entire concept of the root swnodes (when name is provided) should be
> > reconsidered, so somehow we will have a uniqueness so that the entire
> > path(s) behind it will be caller-dependent. But this I also don't like.
> > 
> > Maybe Heikki, Sakari, Rafael can share their thoughts...
> > 
> > Just for my learning, why PHY uses "fixed-link" instead of relying on a
> > (firmware) graph? It might be the actual solution to your problem.
> 
> That's a question for Andrew, but I've tried to solicit his comments on
> several occasions concerning this "feature" of DSA but I keep getting
> no reply. Honestly, I don't know the answer to your question.
> 
> The only thing that I know is that Andrew has been promoting this
> feature where a switch port, whether it be connected to the CPU or
> to another switch, which doesn't specify any link parameters will
> automatically use the fastest "phy interface mode" and the fastest
> link speed that can be supported by the DSA device.
> 
> This has caused issues over the last few years which we've bodged
> around in various ways, and with updates to one of the DSA drivers
> this bodging is becoming more of a wart that's spreading. So, I'm
> trying to find a way to solve this.
> 
> My initial approach was to avoid fiddling with the firmware tree,
> but Vladimir proposed this approach as being cleaner - and it means
> the "bodge" becomes completely localised in the DSA (distributed
> switch architecture) code rather than being spread into phylink.
> 
> I wish we could get rid of this "feature" but since it's been
> established for many years, and we have at least one known driver
> that uses it, getting rid of it breaks existing firmware trees.
> I think we also have one other driver that makes use of it as
> well, but I can't say for certain (because it's not really possible
> to discern which drivers use this feature from reading the driver
> code.) I've tried asking Andrew if he knows and got no response.
> 
> So I'm in a complete information vacuum here - all that I know is
> that trying to convert the mv88e6xxx DSA driver to use phylink_pcs
> will break it (as reported by Marek Behún), because phylink doesn't
> get used if firmware is using this "defaulting" feature.
> 
> It's part of the DT binding, and remains so today - the properties
> specifying the "phy-mode", "fixed-link" etc all remain optional.

Okay, grepping the kernel I see this:

	dn = fwnode_get_named_child_node(fwnode, "fixed-link");

This seems the same what you need. I dunno why swnode should be created with
a name for this?

Eliminating an empty root node sounds plausible effect, but the consequences
are not 1:1 mapping of swnodes as it's designed for

  firmware device node		+=	unique root swnode
    property "X"		+=	property "Y"
    child "A"			+=	child "B"

Resulting firmware node as driver sees it:

	device node
		property "X"
		property "Y"
		child "A"
		child "B"

That's all said, I guess the way with a two swnodes (hierarhy) is the correct
one from the beginning.

To the API, now I can tell you how to validate!

Just be sure if there is no name provided, we are just fine. Otherwise
parent _swnode_ should be non-NULL. In such case parent can be only set
either dynamically _or_ statically assigned with a name.
Russell King (Oracle) July 18, 2022, 8:38 p.m. UTC | #13
On Mon, Jul 18, 2022 at 11:07:30PM +0300, Andy Shevchenko wrote:
> On Mon, Jul 18, 2022 at 08:11:40PM +0100, Russell King (Oracle) wrote:
> > Good point - I guess we at least need to attach the swnode parent to the
> > device so its path is unique, because right now that isn't the case. I'm
> > guessing that:
> > 
> >         new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > 
> > will create something at the root of the swnode tree, and then:
> > 
> >         fixed_link_fwnode = fwnode_create_named_software_node(fixed_link_props,
> >                                                               new_port_fwnode,
> >                                                               "fixed-link");
> > 
> > will create a node with a fixed name. I guess it in part depends what
> > pathname the first node gets (which we don't specify.) I'm not familiar
> > with the swnode code to know what happens with the naming for the first
> > node.
> 
> First node's name will be unique which is guaranteed by IDA framework. If we
> have already 2B nodes, then yes, it would be problematic (but 2^31 ought to be
> enough :-).
> 
> > However, it seems sensible to me to attach the first node to the device
> > node, thus giving it a unique fwnode path. Does that solve the problem
> > in swnode land?
> 
> Yes, but in the driver you will have that as child of the device, analogue in DT
> 
>   my_root_node { // equal the level of device node you attach it to
> 	  fixed-link {
> 	  }
>   }
> 
> (Sorry, I don't know the DT syntax by heart, but I hope you got the idea.)

Yes, that looks about right.

What we're attempting to do here is create the swnode equivalent of this
DT description:

	some_node {
		phy-mode = "foo";

		fixed-link {
			speed = X;
			full-duplex;
		};
	};

and the some_node fwnode handle gets passed into phylink for it to
parse - we never attach it to the firmware tree itself. Once phylink
has parsed it, we destroy the swnode tree since it's no longer useful.

This would get used in this situation as an example:

	switch@4 {
		compatible = "marvell,mv88e6085";

		ports {
			port@0 {
				reg = <0>;
				phy-mode = "internal";
				phy-handle = <&sw_phy_0>;
			};
			...
			port@5 {
				reg = <5>;
				label = "cpu";
				ethernet = <&eth1>;
			};
		};
	};

The DSA driver knows the capabilities of the chip, so it knows what the
fastest "phy-mode" and speed would be, and whether full or half duplex
are supported.

We need to get this information into phylink some how, and my initial
approach was to add a new function to phylink to achieve that.

We would normally have passed the "port@5" node to phylink, just as we
pass the "port@0" node. However, because the "port@5" operates as a
fixed-link as determined by the hardware/driver, we need some way to
get that into phylink.

So, Vladimir's approach is to create a swnode tree that reflects the
DT layout, and rather than passing the "port@5" as a fwnode to phylink,
we instead pass that "some_node" swnode instead. Phylink then uses
normal fwnode APIs to parse the swnode tree it's been given, resulting
in it picking up the fixed-link specification as if it had been in the
original DT.

We don't augment the existing firmware tree for "port@5", we are
effectively creating a small sub-tree and using it as a subsitute
description.

I hope that clarifies what is going on here and why.
Marek Behún July 18, 2022, 8:39 p.m. UTC | #14
On Mon, 18 Jul 2022 22:24:09 +0300
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Mon, Jul 18, 2022 at 08:14:58PM +0100, Russell King (Oracle) wrote:
> > On Mon, Jul 18, 2022 at 09:53:39PM +0300, Andy Shevchenko wrote:  
> > > On Mon, Jul 18, 2022 at 09:43:42PM +0300, Andy Shevchenko wrote:  
> > > > On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:  
> > > > > On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:  
> > > > > > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:  
> > > > > > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > > > > > problem that it's going to fail, or that it's not trivial to statically
> > > > > > > determine whether it'll fail?
> > > > > > > 
> > > > > > > Sorry, but I don't see something actionable about this.  
> > > > > > 
> > > > > > I'm talking about validation before a runtime. But if you think that is fine,
> > > > > > let's fail it at runtime, okay, and consume more backtraces in the future.  
> > > > > 
> > > > > Is there any sane way to do validation of this namespace before
> > > > > runtime?  
> > > > 
> > > > For statically compiled, I think we can do it (to some extent).
> > > > Currently only three drivers, if I'm not mistaken, define software nodes with
> > > > names. It's easy to check that their node names are unique.
> > > > 
> > > > When you allow such an API then we might have tracebacks (from sysfs) bout name
> > > > collisions. Not that is something new to kernel (we have seen many of a kind),
> > > > but I prefer, if possible, to validate this before sysfs issues a traceback.
> > > >   
> > > > > The problem in this instance is we need a node named "fixed-link" that
> > > > > is attached to the parent node as that is defined in the binding doc,
> > > > > and we're creating swnodes to provide software generated nodes for
> > > > > this binding.  
> > > > 
> > > > And how you guarantee that it will be only a single one with unique pathname?
> > > > 
> > > > For example, you have two DSA cards (or whatever it's called) in the SMP system,
> > > > it mean that there is non-zero probability of coexisting swnodes for them.
> > > >   
> > > > > There could be several such nodes scattered around, but in this
> > > > > instance they are very short-lived before they are destroyed, they
> > > > > don't even need to be published to userspace (and its probably a waste
> > > > > of CPU cycles for them to be published there.)
> > > > > 
> > > > > So, for this specific case, is this the best approach, or is there
> > > > > some better way to achieve what we need here?  
> > > > 
> > > > Honestly, I don't know.
> > > > 
> > > > The "workaround" (but it looks to me rather a hack) is to create unique swnode
> > > > and make fixed-link as a child of it.
> > > > 
> > > > Or entire concept of the root swnodes (when name is provided) should be
> > > > reconsidered, so somehow we will have a uniqueness so that the entire
> > > > path(s) behind it will be caller-dependent. But this I also don't like.
> > > > 
> > > > Maybe Heikki, Sakari, Rafael can share their thoughts...
> > > > 
> > > > Just for my learning, why PHY uses "fixed-link" instead of relying on a
> > > > (firmware) graph? It might be the actual solution to your problem.
> > > > 
> > > > How graphs are used with swnodes, you may look into IPU3 (Intel Camera)
> > > > glue driver to support devices before MIPI standardisation of the
> > > > respective properties.  
> > > 
> > > Forgot to say (yes, it maybe obvious) that this API will be exported,
> > > anyone can use it and trap into the similar issue, because, for example,
> > > of testing in environment with a single instance of the caller.  
> > 
> > I think we're coming to the conclusion that using swnodes is not the
> > correct approach for this problem, correct?  
> 
> If I understand the possibilities of the usage in _this_ case, then it's
> would be problematic (it does not mean it's incorrect). It might be due to
> swnode design restrictions which shouldn't be made, I dunno. That' why
> it's better to ask the others for their opinions.
> 
> By design swnode's name makes not much sense, because the payload there
> is a property set, where _name_ is a must.
> 
> Now, telling you this, I'm questioning myself why the heck I added names
> to swnodes in the intel_quark_i2c_gpio driver...

1. the way we use this new named swnode (in patch 5/6 of this series) is
   that it gets destroyed immediately after being parsed, so I don't
   think there will be collisions in the namespace for forseeable future

   also, we first create an unnamed swnode for port and only then
   fixed-link swnode as a child.

      new_port_fwnode = fwnode_create_software_node(port_props, NULL);
      ...
      fixed_link_fwnode =
        fwnode_create_named_software_node(fixed_link_props,
                                          new_port_fwnode, "fixed-link");

   so there shouldn't be a name collision, since the port node gets a
   unique name, or am I misunderstanding this?

2. even if there was a problem with name collision, I think the place
   that needs to be fixed is swnode system. What use are swnodes if
   they cannot be used like this?

Marek
Andrew Lunn July 18, 2022, 8:42 p.m. UTC | #15
> > Just for my learning, why PHY uses "fixed-link" instead of relying on a
> > (firmware) graph? It might be the actual solution to your problem.
> 
> That's a question for Andrew, but I've tried to solicit his comments on
> several occasions concerning this "feature" of DSA but I keep getting
> no reply. Honestly, I don't know the answer to your question.
> 
> The only thing that I know is that Andrew has been promoting this
> feature where a switch port, whether it be connected to the CPU or
> to another switch, which doesn't specify any link parameters will
> automatically use the fastest "phy interface mode" and the fastest
> link speed that can be supported by the DSA device.

This goes back to the very beginning of DSA, as far as i know. This
was before the times of DT. Platform data was used to describe the
switch tree, and it was pretty minimalist. It just listed the ports of
the switches and their names. The 'cpu' port had the name 'cpu', and
DSA ports either did not have a name, or 'dsa'. I don't
remember. There was also a table describing the routing between
switches in the tree. The platform data had nothing to describe
interface speeds, and i'm not sure phylib was even involved to control
the integrated PHYs. Marvell switches would power up their PHYs in
autoneg mode, meaning they just worked. In order to make the CPU port
work, which did not have a PHY, the driver would configure the CPU
port into its fastest mode. Same for the DSA ports. A Marvell Switch
connected to a Marvell SoC NIC worked.

Sometime later DT became the way to describe ARM boards, and pretty
much all boards with switches were ARM boards. If i remember
correctly, Florian did the first binding, which was basically
translate the platform data straight into DT. Since the platform data
had no way to describe port speed, the DT binding had no way to
describe port speed. It just kept on defaulting to the maximum speed.

The DT world evolved, and DT bindings were produced for phylib.

At some point, DSA got an interface to phylib. Maybe it was there from
the beginning, maybe it was added later. I don't know. As a result,
the DT properties for phylib became valid for switch user ports.

Without looking at git, i'm a bit hazy why fixed-link was introduced
for CPU and DSA ports. At some point in time, i was asked to make a
Marvell switch work with a Freescale FEC. The FEC had a Fast Ethernet,
where as the switch CPU port was 1G. It could be that in order to make
this work, i added fixed-link support to CPU ports, so i could specify
the CPU port speed, rather than use the default which would not work.

At some point, i had a board with an RGMII interface used as a DSA
link between two switches, and i needed to specify the RGMII
delays. It could of been using a fixed-link allowed the phy-mode to be
specified for a DSA port, thus allowing the delays to be specified?

Basically, fixed-phy for CPU and DSA was added to solve a limitation
of the default fastest port speed not always working.

With time, more vendors got behind DSA and switches other than Marvell
were added. There was not much documentation about the expectations of
switch drivers, and i doubt this default maximum speed behaviour of
CPU ports was documented. I probably commented on earlier drivers that
fixed-link could be used, or that the Marvell behaviour could be
copied.

And some devices probably power up ports at their maximum speed,
others can probably be strapped to specific modes. Some drivers
default to maximum speed, others required fixed links to specific the
speed. As Russell says, code inspection is not enough to tell what is
going on.

	Andrew
Andy Shevchenko July 18, 2022, 8:48 p.m. UTC | #16
On Mon, Jul 18, 2022 at 10:39:42PM +0200, Marek Behún wrote:
> On Mon, 18 Jul 2022 22:24:09 +0300
> Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
> 
> > On Mon, Jul 18, 2022 at 08:14:58PM +0100, Russell King (Oracle) wrote:
> > > On Mon, Jul 18, 2022 at 09:53:39PM +0300, Andy Shevchenko wrote:  
> > > > On Mon, Jul 18, 2022 at 09:43:42PM +0300, Andy Shevchenko wrote:  
> > > > > On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:  
> > > > > > On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:  
> > > > > > > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:  
> > > > > > > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > > > > > > problem that it's going to fail, or that it's not trivial to statically
> > > > > > > > determine whether it'll fail?
> > > > > > > > 
> > > > > > > > Sorry, but I don't see something actionable about this.  
> > > > > > > 
> > > > > > > I'm talking about validation before a runtime. But if you think that is fine,
> > > > > > > let's fail it at runtime, okay, and consume more backtraces in the future.  
> > > > > > 
> > > > > > Is there any sane way to do validation of this namespace before
> > > > > > runtime?  
> > > > > 
> > > > > For statically compiled, I think we can do it (to some extent).
> > > > > Currently only three drivers, if I'm not mistaken, define software nodes with
> > > > > names. It's easy to check that their node names are unique.
> > > > > 
> > > > > When you allow such an API then we might have tracebacks (from sysfs) bout name
> > > > > collisions. Not that is something new to kernel (we have seen many of a kind),
> > > > > but I prefer, if possible, to validate this before sysfs issues a traceback.
> > > > >   
> > > > > > The problem in this instance is we need a node named "fixed-link" that
> > > > > > is attached to the parent node as that is defined in the binding doc,
> > > > > > and we're creating swnodes to provide software generated nodes for
> > > > > > this binding.  
> > > > > 
> > > > > And how you guarantee that it will be only a single one with unique pathname?
> > > > > 
> > > > > For example, you have two DSA cards (or whatever it's called) in the SMP system,
> > > > > it mean that there is non-zero probability of coexisting swnodes for them.
> > > > >   
> > > > > > There could be several such nodes scattered around, but in this
> > > > > > instance they are very short-lived before they are destroyed, they
> > > > > > don't even need to be published to userspace (and its probably a waste
> > > > > > of CPU cycles for them to be published there.)
> > > > > > 
> > > > > > So, for this specific case, is this the best approach, or is there
> > > > > > some better way to achieve what we need here?  
> > > > > 
> > > > > Honestly, I don't know.
> > > > > 
> > > > > The "workaround" (but it looks to me rather a hack) is to create unique swnode
> > > > > and make fixed-link as a child of it.
> > > > > 
> > > > > Or entire concept of the root swnodes (when name is provided) should be
> > > > > reconsidered, so somehow we will have a uniqueness so that the entire
> > > > > path(s) behind it will be caller-dependent. But this I also don't like.
> > > > > 
> > > > > Maybe Heikki, Sakari, Rafael can share their thoughts...
> > > > > 
> > > > > Just for my learning, why PHY uses "fixed-link" instead of relying on a
> > > > > (firmware) graph? It might be the actual solution to your problem.
> > > > > 
> > > > > How graphs are used with swnodes, you may look into IPU3 (Intel Camera)
> > > > > glue driver to support devices before MIPI standardisation of the
> > > > > respective properties.  
> > > > 
> > > > Forgot to say (yes, it maybe obvious) that this API will be exported,
> > > > anyone can use it and trap into the similar issue, because, for example,
> > > > of testing in environment with a single instance of the caller.  
> > > 
> > > I think we're coming to the conclusion that using swnodes is not the
> > > correct approach for this problem, correct?  
> > 
> > If I understand the possibilities of the usage in _this_ case, then it's
> > would be problematic (it does not mean it's incorrect). It might be due to
> > swnode design restrictions which shouldn't be made, I dunno. That' why
> > it's better to ask the others for their opinions.
> > 
> > By design swnode's name makes not much sense, because the payload there
> > is a property set, where _name_ is a must.
> > 
> > Now, telling you this, I'm questioning myself why the heck I added names
> > to swnodes in the intel_quark_i2c_gpio driver...
> 
> 1. the way we use this new named swnode (in patch 5/6 of this series) is
>    that it gets destroyed immediately after being parsed, so I don't
>    think there will be collisions in the namespace for forseeable future
> 
>    also, we first create an unnamed swnode for port and only then
>    fixed-link swnode as a child.
> 
>       new_port_fwnode = fwnode_create_software_node(port_props, NULL);
>       ...
>       fixed_link_fwnode =
>         fwnode_create_named_software_node(fixed_link_props,
>                                           new_port_fwnode, "fixed-link");
> 
>    so there shouldn't be a name collision, since the port node gets a
>    unique name, or am I misunderstanding this?

This is not problem, but what I was talking about is how to guarantee this
hierarchy? See what I answered to RNK.

> 2. even if there was a problem with name collision, I think the place
>    that needs to be fixed is swnode system. What use are swnodes if
>    they cannot be used like this?

Precisely, that's why I don't want to introduce an API that needs to be fixed.
Marek Behún July 19, 2022, 7:18 a.m. UTC | #17
On Mon, 18 Jul 2022 23:48:21 +0300
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Mon, Jul 18, 2022 at 10:39:42PM +0200, Marek Behún wrote:
> > On Mon, 18 Jul 2022 22:24:09 +0300
> > Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
> >   
> > > On Mon, Jul 18, 2022 at 08:14:58PM +0100, Russell King (Oracle) wrote:  
> > > > On Mon, Jul 18, 2022 at 09:53:39PM +0300, Andy Shevchenko wrote:    
> > > > > On Mon, Jul 18, 2022 at 09:43:42PM +0300, Andy Shevchenko wrote:    
> > > > > > On Mon, Jul 18, 2022 at 02:27:02PM +0100, Russell King (Oracle) wrote:    
> > > > > > > On Mon, Jul 18, 2022 at 03:29:52PM +0300, Andy Shevchenko wrote:    
> > > > > > > > On Fri, Jul 15, 2022 at 11:48:41PM +0300, Vladimir Oltean wrote:    
> > > > > > > > > So won't kobject_init_and_add() fail on namespace collision? Is it the
> > > > > > > > > problem that it's going to fail, or that it's not trivial to statically
> > > > > > > > > determine whether it'll fail?
> > > > > > > > > 
> > > > > > > > > Sorry, but I don't see something actionable about this.    
> > > > > > > > 
> > > > > > > > I'm talking about validation before a runtime. But if you think that is fine,
> > > > > > > > let's fail it at runtime, okay, and consume more backtraces in the future.    
> > > > > > > 
> > > > > > > Is there any sane way to do validation of this namespace before
> > > > > > > runtime?    
> > > > > > 
> > > > > > For statically compiled, I think we can do it (to some extent).
> > > > > > Currently only three drivers, if I'm not mistaken, define software nodes with
> > > > > > names. It's easy to check that their node names are unique.
> > > > > > 
> > > > > > When you allow such an API then we might have tracebacks (from sysfs) bout name
> > > > > > collisions. Not that is something new to kernel (we have seen many of a kind),
> > > > > > but I prefer, if possible, to validate this before sysfs issues a traceback.
> > > > > >     
> > > > > > > The problem in this instance is we need a node named "fixed-link" that
> > > > > > > is attached to the parent node as that is defined in the binding doc,
> > > > > > > and we're creating swnodes to provide software generated nodes for
> > > > > > > this binding.    
> > > > > > 
> > > > > > And how you guarantee that it will be only a single one with unique pathname?
> > > > > > 
> > > > > > For example, you have two DSA cards (or whatever it's called) in the SMP system,
> > > > > > it mean that there is non-zero probability of coexisting swnodes for them.
> > > > > >     
> > > > > > > There could be several such nodes scattered around, but in this
> > > > > > > instance they are very short-lived before they are destroyed, they
> > > > > > > don't even need to be published to userspace (and its probably a waste
> > > > > > > of CPU cycles for them to be published there.)
> > > > > > > 
> > > > > > > So, for this specific case, is this the best approach, or is there
> > > > > > > some better way to achieve what we need here?    
> > > > > > 
> > > > > > Honestly, I don't know.
> > > > > > 
> > > > > > The "workaround" (but it looks to me rather a hack) is to create unique swnode
> > > > > > and make fixed-link as a child of it.
> > > > > > 
> > > > > > Or entire concept of the root swnodes (when name is provided) should be
> > > > > > reconsidered, so somehow we will have a uniqueness so that the entire
> > > > > > path(s) behind it will be caller-dependent. But this I also don't like.
> > > > > > 
> > > > > > Maybe Heikki, Sakari, Rafael can share their thoughts...
> > > > > > 
> > > > > > Just for my learning, why PHY uses "fixed-link" instead of relying on a
> > > > > > (firmware) graph? It might be the actual solution to your problem.
> > > > > > 
> > > > > > How graphs are used with swnodes, you may look into IPU3 (Intel Camera)
> > > > > > glue driver to support devices before MIPI standardisation of the
> > > > > > respective properties.    
> > > > > 
> > > > > Forgot to say (yes, it maybe obvious) that this API will be exported,
> > > > > anyone can use it and trap into the similar issue, because, for example,
> > > > > of testing in environment with a single instance of the caller.    
> > > > 
> > > > I think we're coming to the conclusion that using swnodes is not the
> > > > correct approach for this problem, correct?    
> > > 
> > > If I understand the possibilities of the usage in _this_ case, then it's
> > > would be problematic (it does not mean it's incorrect). It might be due to
> > > swnode design restrictions which shouldn't be made, I dunno. That' why
> > > it's better to ask the others for their opinions.
> > > 
> > > By design swnode's name makes not much sense, because the payload there
> > > is a property set, where _name_ is a must.
> > > 
> > > Now, telling you this, I'm questioning myself why the heck I added names
> > > to swnodes in the intel_quark_i2c_gpio driver...  
> > 
> > 1. the way we use this new named swnode (in patch 5/6 of this series) is
> >    that it gets destroyed immediately after being parsed, so I don't
> >    think there will be collisions in the namespace for forseeable future
> > 
> >    also, we first create an unnamed swnode for port and only then
> >    fixed-link swnode as a child.
> > 
> >       new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> >       ...
> >       fixed_link_fwnode =
> >         fwnode_create_named_software_node(fixed_link_props,
> >                                           new_port_fwnode, "fixed-link");
> > 
> >    so there shouldn't be a name collision, since the port node gets a
> >    unique name, or am I misunderstanding this?  
> 
> This is not problem, but what I was talking about is how to guarantee this
> hierarchy? See what I answered to RNK.
> 
> > 2. even if there was a problem with name collision, I think the place
> >    that needs to be fixed is swnode system. What use are swnodes if
> >    they cannot be used like this?  
> 
> Precisely, that's why I don't want to introduce an API that needs to be fixed.
> 

Aha, so you want to ensure that root swnodes are created with unique
name?

Can't we just make it so that named software node must have a parent?

Marek
Sakari Ailus July 19, 2022, 8:50 a.m. UTC | #18
Hi Russell, Andy,

On Mon, Jul 18, 2022 at 09:38:42PM +0100, Russell King (Oracle) wrote:
> On Mon, Jul 18, 2022 at 11:07:30PM +0300, Andy Shevchenko wrote:
> > On Mon, Jul 18, 2022 at 08:11:40PM +0100, Russell King (Oracle) wrote:
> > > Good point - I guess we at least need to attach the swnode parent to the
> > > device so its path is unique, because right now that isn't the case. I'm
> > > guessing that:
> > > 
> > >         new_port_fwnode = fwnode_create_software_node(port_props, NULL);
> > > 
> > > will create something at the root of the swnode tree, and then:
> > > 
> > >         fixed_link_fwnode = fwnode_create_named_software_node(fixed_link_props,
> > >                                                               new_port_fwnode,
> > >                                                               "fixed-link");
> > > 
> > > will create a node with a fixed name. I guess it in part depends what
> > > pathname the first node gets (which we don't specify.) I'm not familiar
> > > with the swnode code to know what happens with the naming for the first
> > > node.
> > 
> > First node's name will be unique which is guaranteed by IDA framework. If we
> > have already 2B nodes, then yes, it would be problematic (but 2^31 ought to be
> > enough :-).
> > 
> > > However, it seems sensible to me to attach the first node to the device
> > > node, thus giving it a unique fwnode path. Does that solve the problem
> > > in swnode land?
> > 
> > Yes, but in the driver you will have that as child of the device, analogue in DT
> > 
> >   my_root_node { // equal the level of device node you attach it to
> > 	  fixed-link {
> > 	  }
> >   }
> > 
> > (Sorry, I don't know the DT syntax by heart, but I hope you got the idea.)
> 
> Yes, that looks about right.
> 
> What we're attempting to do here is create the swnode equivalent of this
> DT description:
> 
> 	some_node {
> 		phy-mode = "foo";
> 
> 		fixed-link {
> 			speed = X;
> 			full-duplex;
> 		};
> 	};
> 
> and the some_node fwnode handle gets passed into phylink for it to
> parse - we never attach it to the firmware tree itself. Once phylink
> has parsed it, we destroy the swnode tree since it's no longer useful.
> 
> This would get used in this situation as an example:
> 
> 	switch@4 {
> 		compatible = "marvell,mv88e6085";
> 
> 		ports {
> 			port@0 {
> 				reg = <0>;
> 				phy-mode = "internal";
> 				phy-handle = <&sw_phy_0>;
> 			};
> 			...
> 			port@5 {
> 				reg = <5>;
> 				label = "cpu";
> 				ethernet = <&eth1>;
> 			};
> 		};
> 	};
> 
> The DSA driver knows the capabilities of the chip, so it knows what the
> fastest "phy-mode" and speed would be, and whether full or half duplex
> are supported.
> 
> We need to get this information into phylink some how, and my initial
> approach was to add a new function to phylink to achieve that.
> 
> We would normally have passed the "port@5" node to phylink, just as we
> pass the "port@0" node. However, because the "port@5" operates as a
> fixed-link as determined by the hardware/driver, we need some way to
> get that into phylink.
> 
> So, Vladimir's approach is to create a swnode tree that reflects the
> DT layout, and rather than passing the "port@5" as a fwnode to phylink,
> we instead pass that "some_node" swnode instead. Phylink then uses
> normal fwnode APIs to parse the swnode tree it's been given, resulting
> in it picking up the fixed-link specification as if it had been in the
> original DT.
> 
> We don't augment the existing firmware tree for "port@5", we are
> effectively creating a small sub-tree and using it as a subsitute
> description.
> 
> I hope that clarifies what is going on here and why.

Basically what your patch is doing is adding a helper function that creates
an fwnode with a given name. This functionality was there previously through
software_node_register_nodes(), with node allocation responsibility residing
on the caller. It's used e.g. here:
drivers/media/pci/intel/ipu3/cio2-bridge.c .

The larger question is perhaps when can you safely remove software nodes.
And which of these two APIs would be preferred. I haven't checked how many
users each has. There's no refcounting nor locking for software nodes, so
once made visible to the rest of the kernel, they're always expected to be
there, unchanged, or at least it needs to be known when they can be removed.
Vladimir Oltean July 20, 2022, 10:56 p.m. UTC | #19
Hi Sakari,

On Tue, Jul 19, 2022 at 08:50:27AM +0000, Sakari Ailus wrote:
> Basically what your patch is doing is adding a helper function that creates
> an fwnode with a given name. This functionality was there previously through
> software_node_register_nodes(), with node allocation responsibility residing
> on the caller. It's used e.g. here:
> drivers/media/pci/intel/ipu3/cio2-bridge.c .
> 
> The larger question is perhaps when can you safely remove software nodes.
> And which of these two APIs would be preferred. I haven't checked how many
> users each has. There's no refcounting nor locking for software nodes, so
> once made visible to the rest of the kernel, they're always expected to be
> there, unchanged, or at least it needs to be known when they can be removed.

Just for my clarity, are you saying that this printf selftest is
violating the software nodes' expectation to always be there unchanged
and never be removed?

static void __init fwnode_pointer(void)
{
	const struct software_node softnodes[] = {
		{ .name = "first", },
		{ .name = "second", .parent = &softnodes[0], },
		{ .name = "third", .parent = &softnodes[1], },
		{ NULL /* Guardian */ }
	};
	const char * const full_name = "first/second/third";
	const char * const full_name_second = "first/second";
	const char * const second_name = "second";
	const char * const third_name = "third";
	int rval;

	rval = software_node_register_nodes(softnodes);
	if (rval) {
		pr_warn("cannot register softnodes; rval %d\n", rval);
		return;
	}

	test(full_name_second, "%pfw", software_node_fwnode(&softnodes[1]));
	test(full_name, "%pfw", software_node_fwnode(&softnodes[2]));
	test(full_name, "%pfwf", software_node_fwnode(&softnodes[2]));
	test(second_name, "%pfwP", software_node_fwnode(&softnodes[1]));
	test(third_name, "%pfwP", software_node_fwnode(&softnodes[2]));

	software_node_unregister_nodes(softnodes);
}

The use case in this patch set is essentially equivalent to what printf
does: exposing the software nodes to the rest of the kernel and to user
space is probably not necessary, it's just that we need to call a
function that parses their structure (essentially an equivalent to
calling "test" above). Could you indicate whether there is a better
alternative of doing this?
Sakari Ailus July 22, 2022, 6:21 a.m. UTC | #20
Hi Vladimir,

On Thu, Jul 21, 2022 at 01:56:52AM +0300, Vladimir Oltean wrote:
> Hi Sakari,
> 
> On Tue, Jul 19, 2022 at 08:50:27AM +0000, Sakari Ailus wrote:
> > Basically what your patch is doing is adding a helper function that creates
> > an fwnode with a given name. This functionality was there previously through
> > software_node_register_nodes(), with node allocation responsibility residing
> > on the caller. It's used e.g. here:
> > drivers/media/pci/intel/ipu3/cio2-bridge.c .
> > 
> > The larger question is perhaps when can you safely remove software nodes.
> > And which of these two APIs would be preferred. I haven't checked how many
> > users each has. There's no refcounting nor locking for software nodes, so
> > once made visible to the rest of the kernel, they're always expected to be
> > there, unchanged, or at least it needs to be known when they can be removed.
> 
> Just for my clarity, are you saying that this printf selftest is
> violating the software nodes' expectation to always be there unchanged
> and never be removed?

No. This is the other case, i.e. it's known the nodes can be removed.

> 
> static void __init fwnode_pointer(void)
> {
> 	const struct software_node softnodes[] = {
> 		{ .name = "first", },
> 		{ .name = "second", .parent = &softnodes[0], },
> 		{ .name = "third", .parent = &softnodes[1], },
> 		{ NULL /* Guardian */ }
> 	};
> 	const char * const full_name = "first/second/third";
> 	const char * const full_name_second = "first/second";
> 	const char * const second_name = "second";
> 	const char * const third_name = "third";
> 	int rval;
> 
> 	rval = software_node_register_nodes(softnodes);
> 	if (rval) {
> 		pr_warn("cannot register softnodes; rval %d\n", rval);
> 		return;
> 	}
> 
> 	test(full_name_second, "%pfw", software_node_fwnode(&softnodes[1]));
> 	test(full_name, "%pfw", software_node_fwnode(&softnodes[2]));
> 	test(full_name, "%pfwf", software_node_fwnode(&softnodes[2]));
> 	test(second_name, "%pfwP", software_node_fwnode(&softnodes[1]));
> 	test(third_name, "%pfwP", software_node_fwnode(&softnodes[2]));
> 
> 	software_node_unregister_nodes(softnodes);
> }
> 
> The use case in this patch set is essentially equivalent to what printf
> does: exposing the software nodes to the rest of the kernel and to user
> space is probably not necessary, it's just that we need to call a
> function that parses their structure (essentially an equivalent to
> calling "test" above). Could you indicate whether there is a better
> alternative of doing this?

I'm actually not suggesting to do otherwise. What I wanted to say was that
it'd be best to settle with a single API to create software nodes while
keeping in mind serialising access to the data structure as well as
the lifetime of the software nodes.

This patch is adding another API function to register software nodes which
expands the scope of another that effectively did not allow sub-nodes.
Lifetime management currently doesn't really exist for ACPI nodes (device
or data) and only exists in somewhat unsatisfactory form for DT nodes. That
might be still the best model for software nodes.

Perhaps the API this patch adds is nicer to use than
software_node_register_nodes() and better lends itself for adding
refcounting later on.

I wonder what Andy or Heikki think.
Andy Shevchenko July 29, 2022, 12:08 p.m. UTC | #21
On Tue, Jul 19, 2022 at 09:18:00AM +0200, Marek Behún wrote:
> On Mon, 18 Jul 2022 23:48:21 +0300
> Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
> > On Mon, Jul 18, 2022 at 10:39:42PM +0200, Marek Behún wrote:
> > > On Mon, 18 Jul 2022 22:24:09 +0300
> > > Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

...

> > > 2. even if there was a problem with name collision, I think the place
> > >    that needs to be fixed is swnode system. What use are swnodes if
> > >    they cannot be used like this?  
> > 
> > Precisely, that's why I don't want to introduce an API that needs to be fixed.
> 
> Aha, so you want to ensure that root swnodes are created with unique
> name?
> 
> Can't we just make it so that named software node must have a parent?

That's my proposal, but it might have a downside effect on the existing users
(chtwc_int33fe).
diff mbox series

Patch

diff --git a/drivers/base/swnode.c b/drivers/base/swnode.c
index 0a482212c7e8..b2ea08f0e898 100644
--- a/drivers/base/swnode.c
+++ b/drivers/base/swnode.c
@@ -972,8 +972,9 @@  void software_node_unregister(const struct software_node *node)
 EXPORT_SYMBOL_GPL(software_node_unregister);
 
 struct fwnode_handle *
-fwnode_create_software_node(const struct property_entry *properties,
-			    const struct fwnode_handle *parent)
+fwnode_create_named_software_node(const struct property_entry *properties,
+				  const struct fwnode_handle *parent,
+				  const char *name)
 {
 	struct fwnode_handle *fwnode;
 	struct software_node *node;
@@ -991,6 +992,7 @@  fwnode_create_software_node(const struct property_entry *properties,
 		return ERR_CAST(node);
 
 	node->parent = p ? p->node : NULL;
+	node->name = name;
 
 	fwnode = swnode_register(node, p, 1);
 	if (IS_ERR(fwnode))
@@ -998,6 +1000,14 @@  fwnode_create_software_node(const struct property_entry *properties,
 
 	return fwnode;
 }
+EXPORT_SYMBOL_GPL(fwnode_create_named_software_node);
+
+struct fwnode_handle *
+fwnode_create_software_node(const struct property_entry *properties,
+			    const struct fwnode_handle *parent)
+{
+	return fwnode_create_named_software_node(properties, parent, NULL);
+}
 EXPORT_SYMBOL_GPL(fwnode_create_software_node);
 
 void fwnode_remove_software_node(struct fwnode_handle *fwnode)
diff --git a/include/linux/property.h b/include/linux/property.h
index a5b429d623f6..23330ae2b1fa 100644
--- a/include/linux/property.h
+++ b/include/linux/property.h
@@ -492,6 +492,10 @@  void software_node_unregister(const struct software_node *node);
 struct fwnode_handle *
 fwnode_create_software_node(const struct property_entry *properties,
 			    const struct fwnode_handle *parent);
+struct fwnode_handle *
+fwnode_create_named_software_node(const struct property_entry *properties,
+				  const struct fwnode_handle *parent,
+				  const char *name);
 void fwnode_remove_software_node(struct fwnode_handle *fwnode);
 
 int device_add_software_node(struct device *dev, const struct software_node *node);