diff mbox

[net-next,v3,06/10] net: dsa: Migrate to device_find_class()

Message ID 20170114214713.28109-7-f.fainelli@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Florian Fainelli Jan. 14, 2017, 9:47 p.m. UTC
Now that the base device driver code provides an identical
implementation of dev_find_class() utilize device_find_class() instead
of our own version of it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 net/dsa/dsa.c | 22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

Comments

Greg Kroah-Hartman Jan. 15, 2017, 11:06 a.m. UTC | #1
On Sat, Jan 14, 2017 at 01:47:09PM -0800, Florian Fainelli wrote:
> Now that the base device driver code provides an identical
> implementation of dev_find_class() utilize device_find_class() instead
> of our own version of it.
> 
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  net/dsa/dsa.c | 22 ++--------------------
>  1 file changed, 2 insertions(+), 20 deletions(-)
> 
> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> index 2306d1b87c83..77fa4c4f5828 100644
> --- a/net/dsa/dsa.c
> +++ b/net/dsa/dsa.c
> @@ -455,29 +455,11 @@ EXPORT_SYMBOL_GPL(dsa_switch_resume);
>  #endif
>  
>  /* platform driver init and cleanup *****************************************/
> -static int dev_is_class(struct device *dev, void *class)
> -{
> -	if (dev->class != NULL && !strcmp(dev->class->name, class))
> -		return 1;
> -
> -	return 0;
> -}
> -
> -static struct device *dev_find_class(struct device *parent, char *class)
> -{
> -	if (dev_is_class(parent, class)) {
> -		get_device(parent);
> -		return parent;
> -	}
> -
> -	return device_find_child(parent, class, dev_is_class);
> -}
> -
>  struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev)
>  {
>  	struct device *d;
>  
> -	d = dev_find_class(dev, "mdio_bus");
> +	d = device_find_class(dev, "mdio_bus");
>  	if (d != NULL) {
>  		struct mii_bus *bus;

You want a peer of your device on a specific class?  What is this for?

> @@ -495,7 +477,7 @@ static struct net_device *dev_to_net_device(struct device *dev)
>  {
>  	struct device *d;
>  
> -	d = dev_find_class(dev, "net");
> +	d = device_find_class(dev, "net");
>  	if (d != NULL) {
>  		struct net_device *nd;

Again, huh?  What is the device heirachy here that is so odd that this
type of function is needed?

totally confused,

greg k-h
Florian Fainelli Jan. 15, 2017, 5:27 p.m. UTC | #2
On 01/15/2017 03:06 AM, Greg KH wrote:
> On Sat, Jan 14, 2017 at 01:47:09PM -0800, Florian Fainelli wrote:
>> Now that the base device driver code provides an identical
>> implementation of dev_find_class() utilize device_find_class() instead
>> of our own version of it.
>>
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>> ---
>>  net/dsa/dsa.c | 22 ++--------------------
>>  1 file changed, 2 insertions(+), 20 deletions(-)
>>
>> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
>> index 2306d1b87c83..77fa4c4f5828 100644
>> --- a/net/dsa/dsa.c
>> +++ b/net/dsa/dsa.c
>> @@ -455,29 +455,11 @@ EXPORT_SYMBOL_GPL(dsa_switch_resume);
>>  #endif
>>  
>>  /* platform driver init and cleanup *****************************************/
>> -static int dev_is_class(struct device *dev, void *class)
>> -{
>> -	if (dev->class != NULL && !strcmp(dev->class->name, class))
>> -		return 1;
>> -
>> -	return 0;
>> -}
>> -
>> -static struct device *dev_find_class(struct device *parent, char *class)
>> -{
>> -	if (dev_is_class(parent, class)) {
>> -		get_device(parent);
>> -		return parent;
>> -	}
>> -
>> -	return device_find_child(parent, class, dev_is_class);
>> -}
>> -
>>  struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev)
>>  {
>>  	struct device *d;
>>  
>> -	d = dev_find_class(dev, "mdio_bus");
>> +	d = device_find_class(dev, "mdio_bus");
>>  	if (d != NULL) {
>>  		struct mii_bus *bus;
> 
> You want a peer of your device on a specific class?  What is this for?

It's not a peer of our device, it's a separate device reference from the
one looked up in the "net" class. In the classic, and now deprecated DSA
device driver model, a "dsa" platform device would represent one or more
Ethernet switches, connected via a MDIO bus (this reference above), and
one Ethernet device (the CPU/host/management interface). This was
completely violating the Linux device driver model and imposed
limitations on what bus would be used, and we did not have proper struct
device references (therefore no adequate hierarchy either).

Thanks to the work of Andrew, we now have proper MDIO, SPI, GPIO, I2C,
PCI, platform and drivers that allow us to register with DSA as a
specialized kind of device (so we are now finally using the right Linux
Device Driver model). What we still need though, in order to our switch
to the networking stack is a reference to the master/host network device
since we mangle packets in and out of it.

> 
>> @@ -495,7 +477,7 @@ static struct net_device *dev_to_net_device(struct device *dev)
>>  {
>>  	struct device *d;
>>  
>> -	d = dev_find_class(dev, "net");
>> +	d = device_find_class(dev, "net");
>>  	if (d != NULL) {
>>  		struct net_device *nd;
> 
> Again, huh?  What is the device heirachy here that is so odd that this
> type of function is needed?

An Ethernet switch managed by DSA needs to have one ore more references
to a host/CPU/management network interface, this is what this struct
device reference is here for.
--
Florian
Greg Kroah-Hartman Jan. 15, 2017, 5:39 p.m. UTC | #3
On Sun, Jan 15, 2017 at 09:27:22AM -0800, Florian Fainelli wrote:
> 
> 
> On 01/15/2017 03:06 AM, Greg KH wrote:
> > On Sat, Jan 14, 2017 at 01:47:09PM -0800, Florian Fainelli wrote:
> >> Now that the base device driver code provides an identical
> >> implementation of dev_find_class() utilize device_find_class() instead
> >> of our own version of it.
> >>
> >> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> >> ---
> >>  net/dsa/dsa.c | 22 ++--------------------
> >>  1 file changed, 2 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> >> index 2306d1b87c83..77fa4c4f5828 100644
> >> --- a/net/dsa/dsa.c
> >> +++ b/net/dsa/dsa.c
> >> @@ -455,29 +455,11 @@ EXPORT_SYMBOL_GPL(dsa_switch_resume);
> >>  #endif
> >>  
> >>  /* platform driver init and cleanup *****************************************/
> >> -static int dev_is_class(struct device *dev, void *class)
> >> -{
> >> -	if (dev->class != NULL && !strcmp(dev->class->name, class))
> >> -		return 1;
> >> -
> >> -	return 0;
> >> -}
> >> -
> >> -static struct device *dev_find_class(struct device *parent, char *class)
> >> -{
> >> -	if (dev_is_class(parent, class)) {
> >> -		get_device(parent);
> >> -		return parent;
> >> -	}
> >> -
> >> -	return device_find_child(parent, class, dev_is_class);
> >> -}
> >> -
> >>  struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev)
> >>  {
> >>  	struct device *d;
> >>  
> >> -	d = dev_find_class(dev, "mdio_bus");
> >> +	d = device_find_class(dev, "mdio_bus");
> >>  	if (d != NULL) {
> >>  		struct mii_bus *bus;
> > 
> > You want a peer of your device on a specific class?  What is this for?
> 
> It's not a peer of our device, it's a separate device reference from the
> one looked up in the "net" class. In the classic, and now deprecated DSA
> device driver model, a "dsa" platform device would represent one or more
> Ethernet switches, connected via a MDIO bus (this reference above), and
> one Ethernet device (the CPU/host/management interface). This was
> completely violating the Linux device driver model and imposed
> limitations on what bus would be used, and we did not have proper struct
> device references (therefore no adequate hierarchy either).
> 
> Thanks to the work of Andrew, we now have proper MDIO, SPI, GPIO, I2C,
> PCI, platform and drivers that allow us to register with DSA as a
> specialized kind of device (so we are now finally using the right Linux
> Device Driver model). What we still need though, in order to our switch
> to the networking stack is a reference to the master/host network device
> since we mangle packets in and out of it.

Ok, but where in the tree are you trying to find this other device?  Are
you just going to randomly find any device that happens to be of the
specific class type?  That seems really fragile and broken :(

You should NEVER have to walk the device tree to find stuff.  Why can't
you have a pointer to the device you need to talk to some other way?

What exactly is the relationship between these devices (a ascii-art tree
or sysfs tree output might be nice) so I can try to understand what is
going on here.

thanks,

greg k-h
Florian Fainelli Jan. 15, 2017, 5:52 p.m. UTC | #4
On 01/15/2017 09:39 AM, Greg KH wrote:
> On Sun, Jan 15, 2017 at 09:27:22AM -0800, Florian Fainelli wrote:
>>
>>
>> On 01/15/2017 03:06 AM, Greg KH wrote:
>>> On Sat, Jan 14, 2017 at 01:47:09PM -0800, Florian Fainelli wrote:
>>>> Now that the base device driver code provides an identical
>>>> implementation of dev_find_class() utilize device_find_class() instead
>>>> of our own version of it.
>>>>
>>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>>> ---
>>>>  net/dsa/dsa.c | 22 ++--------------------
>>>>  1 file changed, 2 insertions(+), 20 deletions(-)
>>>>
>>>> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
>>>> index 2306d1b87c83..77fa4c4f5828 100644
>>>> --- a/net/dsa/dsa.c
>>>> +++ b/net/dsa/dsa.c
>>>> @@ -455,29 +455,11 @@ EXPORT_SYMBOL_GPL(dsa_switch_resume);
>>>>  #endif
>>>>  
>>>>  /* platform driver init and cleanup *****************************************/
>>>> -static int dev_is_class(struct device *dev, void *class)
>>>> -{
>>>> -	if (dev->class != NULL && !strcmp(dev->class->name, class))
>>>> -		return 1;
>>>> -
>>>> -	return 0;
>>>> -}
>>>> -
>>>> -static struct device *dev_find_class(struct device *parent, char *class)
>>>> -{
>>>> -	if (dev_is_class(parent, class)) {
>>>> -		get_device(parent);
>>>> -		return parent;
>>>> -	}
>>>> -
>>>> -	return device_find_child(parent, class, dev_is_class);
>>>> -}
>>>> -
>>>>  struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev)
>>>>  {
>>>>  	struct device *d;
>>>>  
>>>> -	d = dev_find_class(dev, "mdio_bus");
>>>> +	d = device_find_class(dev, "mdio_bus");
>>>>  	if (d != NULL) {
>>>>  		struct mii_bus *bus;
>>>
>>> You want a peer of your device on a specific class?  What is this for?
>>
>> It's not a peer of our device, it's a separate device reference from the
>> one looked up in the "net" class. In the classic, and now deprecated DSA
>> device driver model, a "dsa" platform device would represent one or more
>> Ethernet switches, connected via a MDIO bus (this reference above), and
>> one Ethernet device (the CPU/host/management interface). This was
>> completely violating the Linux device driver model and imposed
>> limitations on what bus would be used, and we did not have proper struct
>> device references (therefore no adequate hierarchy either).
>>
>> Thanks to the work of Andrew, we now have proper MDIO, SPI, GPIO, I2C,
>> PCI, platform and drivers that allow us to register with DSA as a
>> specialized kind of device (so we are now finally using the right Linux
>> Device Driver model). What we still need though, in order to our switch
>> to the networking stack is a reference to the master/host network device
>> since we mangle packets in and out of it.
> 
> Ok, but where in the tree are you trying to find this other device?  Are
> you just going to randomly find any device that happens to be of the
> specific class type?  That seems really fragile and broken :(

The search is not really random, since the platform data for the DSA
switch we want to register gives us a reference to a device registered
via any bus type, and we know (by contract it has to) that it has to
provide a network device of some kind, we do a specific search in the
"net" class for that purpose. I agree this is not great.

> 
> You should NEVER have to walk the device tree to find stuff.  Why can't
> you have a pointer to the device you need to talk to some other way?

We do, I think the existing code just tries to be extra careful here and
make sure that the device reference we got is actually part of the "net"
class, indicating that the reference passed is indeed pointing to a
network device, and not a random device.

> 
> What exactly is the relationship between these devices (a ascii-art tree
> or sysfs tree output might be nice) so I can try to understand what is
> going on here.

OK, let me try and let's take the case of only one Ethernet switch in
the system for all simplicity:

- switch device drivers get probed by any kind of bus allocate and
register a dsa_switch with dsa_register_switch()

- this switch device has platform data (struct dsa_chip_data) that
describes its port layout (number of ports mostly) and has one struct
device references to which Ethernet network interface these ports connect to

- a dsa_switch_tree is created, this struct dsa_switch is attached to
the dsa_switch_tree at position 0 in the tree, we invoke a bunch of
switch driver operations

- we conclude with attaching this dsa_switch_tree to the network device
we looked up and assigning the tree to the dsa_ptr member

Documentation/networking/dsa/dsa.txt has some ascii art drawing that
sort of capture what is going on.
Andrew Lunn Jan. 15, 2017, 7:16 p.m. UTC | #5
> > What exactly is the relationship between these devices (a ascii-art tree
> > or sysfs tree output might be nice) so I can try to understand what is
> > going on here.

Hi Greg, Florian

A few diagrams and trees which might help understand what is going on.

The first diagram comes from the 2008 patch which added all this code:

            +-----------+       +-----------+
            |           | RGMII |           |
            |           +-------+           +------ 1000baseT MDI ("WAN")
            |           |       |  6-port   +------ 1000baseT MDI ("LAN1")
            |    CPU    |       |  ethernet +------ 1000baseT MDI ("LAN2")
            |           |MIImgmt|  switch   +------ 1000baseT MDI ("LAN3")
            |           +-------+  w/5 PHYs +------ 1000baseT MDI ("LAN4")
            |           |       |           |
            +-----------+       +-----------+

We have an ethernet switch and a host CPU. The switch is connected to
the CPU in two different ways. RGMII allows us to get Ethernet frames
from the CPU into the switch. MIImgmt, is the management bus normally
used for Ethernet PHYs, but Marvell switches also use it for Managing
switches.

The diagram above is the simplest setup. You can have multiple
Ethernet switches, connected together via switch ports. Each switch
has its own MIImgmt connect to the CPU, but there is only one RGMII
link.

When this code was designed back in 2008, it was decided to represent
this is a platform device, and it has a platform_data, which i have
slightly edited to keep it simple:

struct dsa_platform_data {
        /*
         * Reference to a Linux network interface that connects
         * to the root switch chip of the tree.
         */
        struct device   *netdev;

        /*
         * Info structs describing each of the switch chips
         * connected via this network interface.
         */
        int             nr_chips;
        struct dsa_chip_data    *chip;
};

This netdev is the CPU side of the RGMII interface.

Each switch has a dsa_chip_data, again edited:

struct dsa_chip_data {
        /*
         * How to access the switch configuration registers.
         */
        struct device   *host_dev;
        int             sw_addr;
...
}

The host_dev is the CPU side of the MIImgmt, and we have the address
the switch is using on the bus.

During probe of this platform device, we need to get from the
struct device *netdev to a struct net_device *dev.

So the code looks in the device net class to find the device

|   |   |   |-- f1074000.ethernet
|   |   |   |   |-- deferred_probe
|   |   |   |   |-- driver -> ../../../../../bus/platform/drivers/mvneta
|   |   |   |   |-- driver_override
|   |   |   |   |-- modalias
|   |   |   |   |-- net
|   |   |   |   |   `-- eth1
|   |   |   |   |       |-- addr_assign_type
|   |   |   |   |       |-- address
|   |   |   |   |       |-- addr_len
|   |   |   |   |       |-- broadcast
|   |   |   |   |       |-- carrier
|   |   |   |   |       |-- carrier_changes
|   |   |   |   |       |-- deferred_probe
|   |   |   |   |       |-- device -> ../../../f1074000.ethernet

and then use container_of() to get the net_device.

Similarly, the code needs to get from struct device *host_dev to a struct mii_bus *.

|   |   |   |-- f1072004.mdio
|   |   |   |   |-- deferred_probe
|   |   |   |   |-- driver -> ../../../../../bus/platform/drivers/orion-mdio
|   |   |   |   |-- driver_override
|   |   |   |   |-- mdio_bus
|   |   |   |   |   `-- f1072004.mdio-mi
|   |   |   |   |       |-- deferred_probe
|   |   |   |   |       |-- device -> ../../../f1072004.mdio

    Andrew
Florian Fainelli Jan. 16, 2017, 8:01 p.m. UTC | #6
On 01/15/2017 11:16 AM, Andrew Lunn wrote:
>>> What exactly is the relationship between these devices (a ascii-art tree
>>> or sysfs tree output might be nice) so I can try to understand what is
>>> going on here.
> 
> Hi Greg, Florian
> 
> A few diagrams and trees which might help understand what is going on.
> 
> The first diagram comes from the 2008 patch which added all this code:
> 
>             +-----------+       +-----------+
>             |           | RGMII |           |
>             |           +-------+           +------ 1000baseT MDI ("WAN")
>             |           |       |  6-port   +------ 1000baseT MDI ("LAN1")
>             |    CPU    |       |  ethernet +------ 1000baseT MDI ("LAN2")
>             |           |MIImgmt|  switch   +------ 1000baseT MDI ("LAN3")
>             |           +-------+  w/5 PHYs +------ 1000baseT MDI ("LAN4")
>             |           |       |           |
>             +-----------+       +-----------+
> 
> We have an ethernet switch and a host CPU. The switch is connected to
> the CPU in two different ways. RGMII allows us to get Ethernet frames
> from the CPU into the switch. MIImgmt, is the management bus normally
> used for Ethernet PHYs, but Marvell switches also use it for Managing
> switches.
> 
> The diagram above is the simplest setup. You can have multiple
> Ethernet switches, connected together via switch ports. Each switch
> has its own MIImgmt connect to the CPU, but there is only one RGMII
> link.
> 
> When this code was designed back in 2008, it was decided to represent
> this is a platform device, and it has a platform_data, which i have
> slightly edited to keep it simple:
> 
> struct dsa_platform_data {
>         /*
>          * Reference to a Linux network interface that connects
>          * to the root switch chip of the tree.
>          */
>         struct device   *netdev;
> 
>         /*
>          * Info structs describing each of the switch chips
>          * connected via this network interface.
>          */
>         int             nr_chips;
>         struct dsa_chip_data    *chip;
> };
> 
> This netdev is the CPU side of the RGMII interface.
> 
> Each switch has a dsa_chip_data, again edited:
> 
> struct dsa_chip_data {
>         /*
>          * How to access the switch configuration registers.
>          */
>         struct device   *host_dev;
>         int             sw_addr;
> ...
> }
> 
> The host_dev is the CPU side of the MIImgmt, and we have the address
> the switch is using on the bus.
> 
> During probe of this platform device, we need to get from the
> struct device *netdev to a struct net_device *dev.
> 
> So the code looks in the device net class to find the device
> 
> |   |   |   |-- f1074000.ethernet
> |   |   |   |   |-- deferred_probe
> |   |   |   |   |-- driver -> ../../../../../bus/platform/drivers/mvneta
> |   |   |   |   |-- driver_override
> |   |   |   |   |-- modalias
> |   |   |   |   |-- net
> |   |   |   |   |   `-- eth1
> |   |   |   |   |       |-- addr_assign_type
> |   |   |   |   |       |-- address
> |   |   |   |   |       |-- addr_len
> |   |   |   |   |       |-- broadcast
> |   |   |   |   |       |-- carrier
> |   |   |   |   |       |-- carrier_changes
> |   |   |   |   |       |-- deferred_probe
> |   |   |   |   |       |-- device -> ../../../f1074000.ethernet
> 
> and then use container_of() to get the net_device.
> 
> Similarly, the code needs to get from struct device *host_dev to a struct mii_bus *.
> 
> |   |   |   |-- f1072004.mdio
> |   |   |   |   |-- deferred_probe
> |   |   |   |   |-- driver -> ../../../../../bus/platform/drivers/orion-mdio
> |   |   |   |   |-- driver_override
> |   |   |   |   |-- mdio_bus
> |   |   |   |   |   `-- f1072004.mdio-mi
> |   |   |   |   |       |-- deferred_probe
> |   |   |   |   |       |-- device -> ../../../f1072004.mdio
> 

Thanks Andrew! Greg, does that make it clearer how these devices
references are used, do you still think the way this is done is wrong,
too cautious, or valid?
Greg Kroah-Hartman Jan. 18, 2017, 7:06 a.m. UTC | #7
On Mon, Jan 16, 2017 at 12:01:02PM -0800, Florian Fainelli wrote:
> On 01/15/2017 11:16 AM, Andrew Lunn wrote:
> >>> What exactly is the relationship between these devices (a ascii-art tree
> >>> or sysfs tree output might be nice) so I can try to understand what is
> >>> going on here.
> > 
> > Hi Greg, Florian
> > 
> > A few diagrams and trees which might help understand what is going on.
> > 
> > The first diagram comes from the 2008 patch which added all this code:
> > 
> >             +-----------+       +-----------+
> >             |           | RGMII |           |
> >             |           +-------+           +------ 1000baseT MDI ("WAN")
> >             |           |       |  6-port   +------ 1000baseT MDI ("LAN1")
> >             |    CPU    |       |  ethernet +------ 1000baseT MDI ("LAN2")
> >             |           |MIImgmt|  switch   +------ 1000baseT MDI ("LAN3")
> >             |           +-------+  w/5 PHYs +------ 1000baseT MDI ("LAN4")
> >             |           |       |           |
> >             +-----------+       +-----------+
> > 
> > We have an ethernet switch and a host CPU. The switch is connected to
> > the CPU in two different ways. RGMII allows us to get Ethernet frames
> > from the CPU into the switch. MIImgmt, is the management bus normally
> > used for Ethernet PHYs, but Marvell switches also use it for Managing
> > switches.
> > 
> > The diagram above is the simplest setup. You can have multiple
> > Ethernet switches, connected together via switch ports. Each switch
> > has its own MIImgmt connect to the CPU, but there is only one RGMII
> > link.
> > 
> > When this code was designed back in 2008, it was decided to represent
> > this is a platform device, and it has a platform_data, which i have
> > slightly edited to keep it simple:
> > 
> > struct dsa_platform_data {
> >         /*
> >          * Reference to a Linux network interface that connects
> >          * to the root switch chip of the tree.
> >          */
> >         struct device   *netdev;
> > 
> >         /*
> >          * Info structs describing each of the switch chips
> >          * connected via this network interface.
> >          */
> >         int             nr_chips;
> >         struct dsa_chip_data    *chip;
> > };
> > 
> > This netdev is the CPU side of the RGMII interface.
> > 
> > Each switch has a dsa_chip_data, again edited:
> > 
> > struct dsa_chip_data {
> >         /*
> >          * How to access the switch configuration registers.
> >          */
> >         struct device   *host_dev;
> >         int             sw_addr;
> > ...
> > }
> > 
> > The host_dev is the CPU side of the MIImgmt, and we have the address
> > the switch is using on the bus.
> > 
> > During probe of this platform device, we need to get from the
> > struct device *netdev to a struct net_device *dev.
> > 
> > So the code looks in the device net class to find the device
> > 
> > |   |   |   |-- f1074000.ethernet
> > |   |   |   |   |-- deferred_probe
> > |   |   |   |   |-- driver -> ../../../../../bus/platform/drivers/mvneta
> > |   |   |   |   |-- driver_override
> > |   |   |   |   |-- modalias
> > |   |   |   |   |-- net
> > |   |   |   |   |   `-- eth1
> > |   |   |   |   |       |-- addr_assign_type
> > |   |   |   |   |       |-- address
> > |   |   |   |   |       |-- addr_len
> > |   |   |   |   |       |-- broadcast
> > |   |   |   |   |       |-- carrier
> > |   |   |   |   |       |-- carrier_changes
> > |   |   |   |   |       |-- deferred_probe
> > |   |   |   |   |       |-- device -> ../../../f1074000.ethernet
> > 
> > and then use container_of() to get the net_device.
> > 
> > Similarly, the code needs to get from struct device *host_dev to a struct mii_bus *.
> > 
> > |   |   |   |-- f1072004.mdio
> > |   |   |   |   |-- deferred_probe
> > |   |   |   |   |-- driver -> ../../../../../bus/platform/drivers/orion-mdio
> > |   |   |   |   |-- driver_override
> > |   |   |   |   |-- mdio_bus
> > |   |   |   |   |   `-- f1072004.mdio-mi
> > |   |   |   |   |       |-- deferred_probe
> > |   |   |   |   |       |-- device -> ../../../f1072004.mdio
> > 
> 
> Thanks Andrew! Greg, does that make it clearer how these devices
> references are used, do you still think the way this is done is wrong,
> too cautious, or valid?

I'm still not sold on it, I think there is something odd here with your
use/assumptions of the driver model.  Give me a few days to catch up
with other stuff to respond back please...

thanks,

greg k-h
Greg Kroah-Hartman Jan. 19, 2017, 2:28 p.m. UTC | #8
On Mon, Jan 16, 2017 at 12:01:02PM -0800, Florian Fainelli wrote:
> On 01/15/2017 11:16 AM, Andrew Lunn wrote:
> >>> What exactly is the relationship between these devices (a ascii-art tree
> >>> or sysfs tree output might be nice) so I can try to understand what is
> >>> going on here.
> > 
> > Hi Greg, Florian
> > 
> > A few diagrams and trees which might help understand what is going on.
> > 
> > The first diagram comes from the 2008 patch which added all this code:
> > 
> >             +-----------+       +-----------+
> >             |           | RGMII |           |
> >             |           +-------+           +------ 1000baseT MDI ("WAN")
> >             |           |       |  6-port   +------ 1000baseT MDI ("LAN1")
> >             |    CPU    |       |  ethernet +------ 1000baseT MDI ("LAN2")
> >             |           |MIImgmt|  switch   +------ 1000baseT MDI ("LAN3")
> >             |           +-------+  w/5 PHYs +------ 1000baseT MDI ("LAN4")
> >             |           |       |           |
> >             +-----------+       +-----------+
> > 
> > We have an ethernet switch and a host CPU. The switch is connected to
> > the CPU in two different ways. RGMII allows us to get Ethernet frames
> > from the CPU into the switch. MIImgmt, is the management bus normally
> > used for Ethernet PHYs, but Marvell switches also use it for Managing
> > switches.
> > 
> > The diagram above is the simplest setup. You can have multiple
> > Ethernet switches, connected together via switch ports. Each switch
> > has its own MIImgmt connect to the CPU, but there is only one RGMII
> > link.
> > 
> > When this code was designed back in 2008, it was decided to represent
> > this is a platform device, and it has a platform_data, which i have
> > slightly edited to keep it simple:
> > 
> > struct dsa_platform_data {
> >         /*
> >          * Reference to a Linux network interface that connects
> >          * to the root switch chip of the tree.
> >          */
> >         struct device   *netdev;

This I think is the oddest thing, why do you need to have the "root
switch" here?  You seem to have dropped the next value in this
structure:
	struct net_device *of_netdev;

Isn't that the "real" net_device you need/want to get to here?

Or, when you set netdev, can't you also point to the "real" net_device
you want to later get to?

> >         /*
> >          * Info structs describing each of the switch chips
> >          * connected via this network interface.
> >          */
> >         int             nr_chips;
> >         struct dsa_chip_data    *chip;
> > };
> > 
> > This netdev is the CPU side of the RGMII interface.
> > 
> > Each switch has a dsa_chip_data, again edited:
> > 
> > struct dsa_chip_data {
> >         /*
> >          * How to access the switch configuration registers.
> >          */
> >         struct device   *host_dev;
> >         int             sw_addr;
> > ...
> > }

If each switch only has one dsa_chip_data, can't you point directly from
that structure back to the dsa_platform_device?  Or is that what this
"host_dev" pointer is pointing to?

> > The host_dev is the CPU side of the MIImgmt, and we have the address
> > the switch is using on the bus.

"cpu side" means what?  What 'struct device' is this?  The host
controller?  The platform device?  Something else?

> > During probe of this platform device, we need to get from the
> > struct device *netdev to a struct net_device *dev.

Wait, what exactly is this "struct device *"?  Who created it?

And why are you using a platform device?  Shouldn't this be a custom
bus?  I know people like to abuse platform devices, is that the case
here, or is it really driven only by a device tree representation?

Shouldn't you have a bus for RGMII devices?  Is that the real problem
here, you don't have a representation for your RGMII "bus" with a
controller to bundle everything under (like a USB host controller, it
bridges from one bus to another).

> > So the code looks in the device net class to find the device
> > 
> > |   |   |   |-- f1074000.ethernet
> > |   |   |   |   |-- deferred_probe
> > |   |   |   |   |-- driver -> ../../../../../bus/platform/drivers/mvneta
> > |   |   |   |   |-- driver_override
> > |   |   |   |   |-- modalias
> > |   |   |   |   |-- net
> > |   |   |   |   |   `-- eth1
> > |   |   |   |   |       |-- addr_assign_type
> > |   |   |   |   |       |-- address
> > |   |   |   |   |       |-- addr_len
> > |   |   |   |   |       |-- broadcast
> > |   |   |   |   |       |-- carrier
> > |   |   |   |   |       |-- carrier_changes
> > |   |   |   |   |       |-- deferred_probe
> > |   |   |   |   |       |-- device -> ../../../f1074000.ethernet
> > 
> > and then use container_of() to get the net_device.

What 'struct device *' are you passing to container_of() here?  The one
representing eth1?  What call is this?  And where are you trying to go
from there, to a peer?  Why?

> > Similarly, the code needs to get from struct device *host_dev to a struct mii_bus *.
> > 
> > |   |   |   |-- f1072004.mdio
> > |   |   |   |   |-- deferred_probe
> > |   |   |   |   |-- driver -> ../../../../../bus/platform/drivers/orion-mdio
> > |   |   |   |   |-- driver_override
> > |   |   |   |   |-- mdio_bus
> > |   |   |   |   |   `-- f1072004.mdio-mi
> > |   |   |   |   |       |-- deferred_probe
> > |   |   |   |   |       |-- device -> ../../../f1072004.mdio

Ok, this looks like the "peer" you want to get to from eth1, you want
f1072004.mdio-mi, right?

If so, why is eth1 not below f1072004.mdio-mi in the heirachy already?

Why is it up attached to the parent device, making this a "flat"
heirachy?

That's what is confusing about your functions, you are just walking all
of the "child" devices of a specific struct device, trying to find the
first random one that happens to belong to a specific 'bus' because you
are related to it.

Which is wrong, eth1 should be a child of your bus device, that way you
"know" how to get to it (it's your parent!).

So, I'm still confused, and still think this is all wrong, but feel free
to prove me wrong about this :)

thanks,

greg k-h
Andrew Lunn Jan. 19, 2017, 2:53 p.m. UTC | #9
> > > struct dsa_platform_data {
> > >         /*
> > >          * Reference to a Linux network interface that connects
> > >          * to the root switch chip of the tree.
> > >          */
> > >         struct device   *netdev;
> 
> This I think is the oddest thing, why do you need to have the "root
> switch" here?  You seem to have dropped the next value in this
> structure:
> 	struct net_device *of_netdev;

We are implementing platform_data for devices which don't support
device tree. When using OF, we don't have any of these issues. We can
go straight to the device.

It is a bit convoluted, but look at
arch/arm/mach-orion5x/rd88f5181l-ge-setup.c. It defines the start of
the dsa_platform_data in that file. It then gets passed through
common.c: orion5x_eth_switch_init() to 
arch/arm/plat-orion/common.c:orion_ge00_switch_init() :

void __init orion_ge00_switch_init(struct dsa_platform_data *d)
{
        int i;

        d->netdev = &orion_ge00.dev;
        for (i = 0; i < d->nr_chips; i++)
                d->chip[i].host_dev = &orion_ge_mvmdio.dev;

        platform_device_register_data(NULL, "dsa", 0, d, sizeof(d));
}

Where we have

static struct platform_device orion_ge00 = {
        .name           = MV643XX_ETH_NAME,
        .id             = 0,
        .num_resources  = 1,
        .resource       = orion_ge00_resources,
        .dev            = {
                .coherent_dma_mask      = DMA_BIT_MASK(32),
        },
};

So this is the platform device for the Ethernet device. We cannot go
to the net_device, because it does not exist until this Ethernet
platform device is instantiated.

> Shouldn't you have a bus for RGMII devices?  Is that the real problem
> here, you don't have a representation for your RGMII "bus" with a
> controller to bundle everything under (like a USB host controller, it
> bridges from one bus to another).

RGMII is not a bus. It is a point to point link. Normally, it is
between the Ethernet MAC and the Ethernet PHY. But you can also have
it between an Ethernet MAC and another Ethernet MAC. I'm not sure
describing this is a bus would be practical. It would mean every
ethernet driver also becomes a bus driver! Every Ethernet PHY would
become a bus device. That is a huge change, for a few legacy boards
which are not getting converted to device tree.

> If so, why is eth1 not below f1072004.mdio-mi in the heirachy already?

See the initial diagram above. The switch has two parents. It hangs of
an MDIO bus, and you would like to make RGMII also a bus. Can the
device model handle that? I thought it was a tree, not a graph?

       Andrew
Greg Kroah-Hartman Jan. 19, 2017, 4:30 p.m. UTC | #10
On Thu, Jan 19, 2017 at 03:53:15PM +0100, Andrew Lunn wrote:
> > > > struct dsa_platform_data {
> > > >         /*
> > > >          * Reference to a Linux network interface that connects
> > > >          * to the root switch chip of the tree.
> > > >          */
> > > >         struct device   *netdev;
> > 
> > This I think is the oddest thing, why do you need to have the "root
> > switch" here?  You seem to have dropped the next value in this
> > structure:
> > 	struct net_device *of_netdev;
> 
> We are implementing platform_data for devices which don't support
> device tree. When using OF, we don't have any of these issues. We can
> go straight to the device.
> 
> It is a bit convoluted, but look at
> arch/arm/mach-orion5x/rd88f5181l-ge-setup.c. It defines the start of
> the dsa_platform_data in that file. It then gets passed through
> common.c: orion5x_eth_switch_init() to 
> arch/arm/plat-orion/common.c:orion_ge00_switch_init() :
> 
> void __init orion_ge00_switch_init(struct dsa_platform_data *d)
> {
>         int i;
> 
>         d->netdev = &orion_ge00.dev;
>         for (i = 0; i < d->nr_chips; i++)
>                 d->chip[i].host_dev = &orion_ge_mvmdio.dev;
> 
>         platform_device_register_data(NULL, "dsa", 0, d, sizeof(d));
> }
> 
> Where we have
> 
> static struct platform_device orion_ge00 = {
>         .name           = MV643XX_ETH_NAME,
>         .id             = 0,
>         .num_resources  = 1,
>         .resource       = orion_ge00_resources,
>         .dev            = {
>                 .coherent_dma_mask      = DMA_BIT_MASK(32),
>         },
> };
> 
> So this is the platform device for the Ethernet device. We cannot go
> to the net_device, because it does not exist until this Ethernet
> platform device is instantiated.

Ok, fine, but why isn't the ethernet device a child of this platform
device?  Why is it floating around somewhere else?  You don't see that
happening for other devices.

> > Shouldn't you have a bus for RGMII devices?  Is that the real problem
> > here, you don't have a representation for your RGMII "bus" with a
> > controller to bundle everything under (like a USB host controller, it
> > bridges from one bus to another).
> 
> RGMII is not a bus. It is a point to point link.

That's fine, but you have multiple devices talking across it, so in the
kernel driver model "naming", it's a bus.  Anything can be a bus, it's
just a way to group together devices of the same type.

> Normally, it is
> between the Ethernet MAC and the Ethernet PHY. But you can also have
> it between an Ethernet MAC and another Ethernet MAC. I'm not sure
> describing this is a bus would be practical. It would mean every
> ethernet driver also becomes a bus driver!

Instead of a custom platform device driver, yes.  Is that a big deal?
How many do you have?

> Every Ethernet PHY would become a bus device. That is a huge change,
> for a few legacy boards which are not getting converted to device
> tree.

How many different drivers are we talking about here?

> > If so, why is eth1 not below f1072004.mdio-mi in the heirachy already?
> 
> See the initial diagram above. The switch has two parents. It hangs of
> an MDIO bus, and you would like to make RGMII also a bus. Can the
> device model handle that? I thought it was a tree, not a graph?

It is a tree, you are correct.  But right now you are picking and
choosing where you want to put that network device.  Why not put it over
on the mdio bus?  Or, like I mentioned, make it a custom bus where you
can properly show this relationship, not just in a generic "let's jump
to the parent and poke around randomly."

Again, it's that last sentance that I object the most to here.  You all
keep ignoring it for some reason...

thanks,

greg k-h
Russell King (Oracle) Jan. 19, 2017, 4:35 p.m. UTC | #11
On Thu, Jan 19, 2017 at 05:30:13PM +0100, Greg KH wrote:
> On Thu, Jan 19, 2017 at 03:53:15PM +0100, Andrew Lunn wrote:
> > > > > struct dsa_platform_data {
> > > > >         /*
> > > > >          * Reference to a Linux network interface that connects
> > > > >          * to the root switch chip of the tree.
> > > > >          */
> > > > >         struct device   *netdev;
> > > 
> > > This I think is the oddest thing, why do you need to have the "root
> > > switch" here?  You seem to have dropped the next value in this
> > > structure:
> > > 	struct net_device *of_netdev;
> > 
> > We are implementing platform_data for devices which don't support
> > device tree. When using OF, we don't have any of these issues. We can
> > go straight to the device.
> > 
> > It is a bit convoluted, but look at
> > arch/arm/mach-orion5x/rd88f5181l-ge-setup.c. It defines the start of
> > the dsa_platform_data in that file. It then gets passed through
> > common.c: orion5x_eth_switch_init() to 
> > arch/arm/plat-orion/common.c:orion_ge00_switch_init() :
> > 
> > void __init orion_ge00_switch_init(struct dsa_platform_data *d)
> > {
> >         int i;
> > 
> >         d->netdev = &orion_ge00.dev;
> >         for (i = 0; i < d->nr_chips; i++)
> >                 d->chip[i].host_dev = &orion_ge_mvmdio.dev;
> > 
> >         platform_device_register_data(NULL, "dsa", 0, d, sizeof(d));
> > }
> > 
> > Where we have
> > 
> > static struct platform_device orion_ge00 = {
> >         .name           = MV643XX_ETH_NAME,
> >         .id             = 0,
> >         .num_resources  = 1,
> >         .resource       = orion_ge00_resources,
> >         .dev            = {
> >                 .coherent_dma_mask      = DMA_BIT_MASK(32),
> >         },
> > };
> > 
> > So this is the platform device for the Ethernet device. We cannot go
> > to the net_device, because it does not exist until this Ethernet
> > platform device is instantiated.
> 
> Ok, fine, but why isn't the ethernet device a child of this platform
> device?  Why is it floating around somewhere else?  You don't see that
> happening for other devices.

The ethernet device is not a child of the DSA device.  I'm going to
send a mail I've had queued up for a few hours redoing Andrew's
ASCII art, because I think that's what's causing the confusion here,
provided I haven't discarded it.
Russell King (Oracle) Jan. 19, 2017, 4:51 p.m. UTC | #12
(This is mainly for Greg's benefit to help him understand the issue.)

I think the diagram you gave initially made this confusing, as it
talks about a CPU(sic) producing the "RGMII" and "MII-MGMT".

Let's instead show a better representation that hopefully helps Greg
understand networking. :)


  CPU
System <-B->  Ethernet controller <-P-> } PHY <---> network cable
                                        } - - - - - - - or - - - - - - -
                  MDIO bus -------M---> } Switch <-P-> PHYs <--> network
                                              `----M----^        cables

'B' can be an on-SoC bus or something like PCI.

'P' are the high-speed connectivity between the ethernet controller and
PHY which carries the packet data.  It has no addressing, it's a point
to point link.  RGMII is just one wiring example, there are many
different interfaces there (SGMII, AUI, XAUI, XGMII to name a few.)

'M' are the MDIO bus, which is the bus by which ethernet PHYs and
switches can be identified and controlled.

The MDIO bus has a bus_type, has host drivers which are sometimes
part of the ethernet controller, but can also be stand-alone devices
shared between multiple ethernet controllers.

PHYs are a kind of MDIO device which are members of the MDIO bus
type.  Each PHY (and switch) has a numerical address, and identifying
numbers within its register set which identifies the manufacturer
and device type.  We have device_driver objects for these.

Expanding the above diagram to make it (hopefully) even clearer,
we can have this classic setup:

  CPU
System <-B-> Ethernet controller <-P-> PHY <---> network cable
                 MDIO bus -------M------^

Or, in the case of two DSA switches attached to an Ethernet controller:

                                     |~~~~~~~~|
System <-B-> Ethernet controller <-P-> Switch <-P-> PHY1 <--> network cable
                 MDIO bus ----+--M--->   1    <-P-> PHY2 <--> network cable
                              |      |        ...    |
                              |      |        <-P-> PHYn <--> network cable
                              |      |....^...|      |
                              |           |  `---M---'
                              |           P
                              |           |
                              |      |~~~~v~~~|
                              `------> Switch <-P-> PHY1 <--> network cable
                                     |   2    ...    |
                                     |        <-P-> PHYn <--> network cable
                                     |........|      |
                                             `---M---'

The problem that the DSA guys are trying to deal with is how to
represent the link between the DSA switches (which are devices
sitting off their controlling bus - the MDIO bus) and the ethernet
controller associated with that collection of devices, be it a
switch or PHY.

Merely changing the parent/child relationships to try and solve
one issue just creates exactly the same problem elsewhere.

So, I hope with these diagrams, you can see that trying to make
the ethernet controller a child device of the DSA switches
means that (eg) it's no longer a PCI device, which is rather
absurd, especially when considering that what happens to the
right of the ethernet controller in the diagrams above is
normally external chips to the SoC or ethernet device.
Florian Fainelli Jan. 19, 2017, 6:12 p.m. UTC | #13
On 01/19/2017 08:51 AM, Russell King - ARM Linux wrote:
> (This is mainly for Greg's benefit to help him understand the issue.)
> 
> I think the diagram you gave initially made this confusing, as it
> talks about a CPU(sic) producing the "RGMII" and "MII-MGMT".
> 
> Let's instead show a better representation that hopefully helps Greg
> understand networking. :)
> 
> 
>   CPU
> System <-B->  Ethernet controller <-P-> } PHY <---> network cable
>                                         } - - - - - - - or - - - - - - -
>                   MDIO bus -------M---> } Switch <-P-> PHYs <--> network
>                                               `----M----^        cables
> 
> 'B' can be an on-SoC bus or something like PCI.
> 
> 'P' are the high-speed connectivity between the ethernet controller and
> PHY which carries the packet data.  It has no addressing, it's a point
> to point link.  RGMII is just one wiring example, there are many
> different interfaces there (SGMII, AUI, XAUI, XGMII to name a few.)
> 
> 'M' are the MDIO bus, which is the bus by which ethernet PHYs and
> switches can be identified and controlled.

So MDIO is one possible bus type to connect an Ethernet switch (and
PHYs) to a the System, but is not necessarily the only one. You have
existing devices out there with on-chip integrated switches (platform),
SPI/GPIO/I2C/PCI(e).

> 
> The MDIO bus has a bus_type, has host drivers which are sometimes
> part of the ethernet controller, but can also be stand-alone devices
> shared between multiple ethernet controllers.
> 
> PHYs are a kind of MDIO device which are members of the MDIO bus
> type.  Each PHY (and switch) has a numerical address, and identifying
> numbers within its register set which identifies the manufacturer
> and device type.  We have device_driver objects for these.
> 
> Expanding the above diagram to make it (hopefully) even clearer,
> we can have this classic setup:
> 
>   CPU
> System <-B-> Ethernet controller <-P-> PHY <---> network cable
>                  MDIO bus -------M------^
> 
> Or, in the case of two DSA switches attached to an Ethernet controller:
> 
>                                      |~~~~~~~~|
> System <-B-> Ethernet controller <-P-> Switch <-P-> PHY1 <--> network cable
>                  MDIO bus ----+--M--->   1    <-P-> PHY2 <--> network cable
>                               |      |        ...    |
>                               |      |        <-P-> PHYn <--> network cable
>                               |      |....^...|      |
>                               |           |  `---M---'
>                               |           P
>                               |           |
>                               |      |~~~~v~~~|
>                               `------> Switch <-P-> PHY1 <--> network cable
>                                      |   2    ...    |
>                                      |        <-P-> PHYn <--> network cable
>                                      |........|      |
>                                              `---M---'
> 
> The problem that the DSA guys are trying to deal with is how to
> represent the link between the DSA switches (which are devices
> sitting off their controlling bus - the MDIO bus) and the ethernet
> controller associated with that collection of devices, be it a
> switch or PHY.
> 
> Merely changing the parent/child relationships to try and solve
> one issue just creates exactly the same problem elsewhere.

Exactly!

> 
> So, I hope with these diagrams, you can see that trying to make
> the ethernet controller a child device of the DSA switches
> means that (eg) it's no longer a PCI device, which is rather
> absurd, especially when considering that what happens to the
> right of the ethernet controller in the diagrams above is
> normally external chips to the SoC or ethernet device.
> 

Indeed.

Back to the actual code that triggered this discussion, the whole
purpose is just a safeguard. Given a device reference, we can assume
that it is indeed the backing device for a net_device, and we could do a
to_net_device() right away (and crash if someone did not write correct
platform_data structures), or, by walking the device tree (the device
driver model one) we can make sure it does belong in the proper class
and this is indeed what we think it is.

HTH
Florian Fainelli Jan. 24, 2017, 6:59 p.m. UTC | #14
On 01/19/2017 10:12 AM, Florian Fainelli wrote:
> 
> Back to the actual code that triggered this discussion, the whole
> purpose is just a safeguard. Given a device reference, we can assume
> that it is indeed the backing device for a net_device, and we could do a
> to_net_device() right away (and crash if someone did not write correct
> platform_data structures), or, by walking the device tree (the device
> driver model one) we can make sure it does belong in the proper class
> and this is indeed what we think it is.

Greg, did Russell's explanation clarify things, or do you still think
this is completely bogus and we need to re design the whole thing?

Just asking so I can try to resubmit just the preparatory parts or just
the whole thing.

Thank you
Greg Kroah-Hartman Jan. 25, 2017, 9:25 p.m. UTC | #15
On Tue, Jan 24, 2017 at 10:59:15AM -0800, Florian Fainelli wrote:
> On 01/19/2017 10:12 AM, Florian Fainelli wrote:
> > 
> > Back to the actual code that triggered this discussion, the whole
> > purpose is just a safeguard. Given a device reference, we can assume
> > that it is indeed the backing device for a net_device, and we could do a
> > to_net_device() right away (and crash if someone did not write correct
> > platform_data structures), or, by walking the device tree (the device
> > driver model one) we can make sure it does belong in the proper class
> > and this is indeed what we think it is.
> 
> Greg, did Russell's explanation clarify things, or do you still think
> this is completely bogus and we need to re design the whole thing?
> 
> Just asking so I can try to resubmit just the preparatory parts or just
> the whole thing.

Sorry, I haven't gotten back to this, it's lower on my list.  Should try
to get to it tomorrow...
Florian Fainelli Jan. 30, 2017, 10:46 p.m. UTC | #16
On 01/25/2017 01:25 PM, Greg KH wrote:
> On Tue, Jan 24, 2017 at 10:59:15AM -0800, Florian Fainelli wrote:
>> On 01/19/2017 10:12 AM, Florian Fainelli wrote:
>>>
>>> Back to the actual code that triggered this discussion, the whole
>>> purpose is just a safeguard. Given a device reference, we can assume
>>> that it is indeed the backing device for a net_device, and we could do a
>>> to_net_device() right away (and crash if someone did not write correct
>>> platform_data structures), or, by walking the device tree (the device
>>> driver model one) we can make sure it does belong in the proper class
>>> and this is indeed what we think it is.
>>
>> Greg, did Russell's explanation clarify things, or do you still think
>> this is completely bogus and we need to re design the whole thing?
>>
>> Just asking so I can try to resubmit just the preparatory parts or just
>> the whole thing.
> 
> Sorry, I haven't gotten back to this, it's lower on my list.  Should try
> to get to it tomorrow...

Greg, please give some feedback here, I can only produce new patches as
fast as I am given feedback, and I would really hate to miss the 4.11
merge window because we have been sleeping on this. If there is a need
to clarify things, I will be more than happy to try to provide information.

Thank you!
Greg Kroah-Hartman Feb. 10, 2017, 1:02 p.m. UTC | #17
On Thu, Jan 19, 2017 at 04:51:55PM +0000, Russell King - ARM Linux wrote:
> (This is mainly for Greg's benefit to help him understand the issue.)
> 
> I think the diagram you gave initially made this confusing, as it
> talks about a CPU(sic) producing the "RGMII" and "MII-MGMT".
> 
> Let's instead show a better representation that hopefully helps Greg
> understand networking. :)
> 
> 
>   CPU
> System <-B->  Ethernet controller <-P-> } PHY <---> network cable
>                                         } - - - - - - - or - - - - - - -
>                   MDIO bus -------M---> } Switch <-P-> PHYs <--> network
>                                               `----M----^        cables
> 
> 'B' can be an on-SoC bus or something like PCI.
> 
> 'P' are the high-speed connectivity between the ethernet controller and
> PHY which carries the packet data.  It has no addressing, it's a point
> to point link.  RGMII is just one wiring example, there are many
> different interfaces there (SGMII, AUI, XAUI, XGMII to name a few.)
> 
> 'M' are the MDIO bus, which is the bus by which ethernet PHYs and
> switches can be identified and controlled.
> 
> The MDIO bus has a bus_type, has host drivers which are sometimes
> part of the ethernet controller, but can also be stand-alone devices
> shared between multiple ethernet controllers.
> 
> PHYs are a kind of MDIO device which are members of the MDIO bus
> type.  Each PHY (and switch) has a numerical address, and identifying
> numbers within its register set which identifies the manufacturer
> and device type.  We have device_driver objects for these.
> 
> Expanding the above diagram to make it (hopefully) even clearer,
> we can have this classic setup:
> 
>   CPU
> System <-B-> Ethernet controller <-P-> PHY <---> network cable
>                  MDIO bus -------M------^
> 
> Or, in the case of two DSA switches attached to an Ethernet controller:
> 
>                                      |~~~~~~~~|
> System <-B-> Ethernet controller <-P-> Switch <-P-> PHY1 <--> network cable
>                  MDIO bus ----+--M--->   1    <-P-> PHY2 <--> network cable
>                               |      |        ...    |
>                               |      |        <-P-> PHYn <--> network cable
>                               |      |....^...|      |
>                               |           |  `---M---'
>                               |           P
>                               |           |
>                               |      |~~~~v~~~|
>                               `------> Switch <-P-> PHY1 <--> network cable
>                                      |   2    ...    |
>                                      |        <-P-> PHYn <--> network cable
>                                      |........|      |
>                                              `---M---'
> 
> The problem that the DSA guys are trying to deal with is how to
> represent the link between the DSA switches (which are devices
> sitting off their controlling bus - the MDIO bus) and the ethernet
> controller associated with that collection of devices, be it a
> switch or PHY.

Why do they have to represent that link?  This is a driver that somehow
binds the two togther in some sort of "control plane"?

> Merely changing the parent/child relationships to try and solve
> one issue just creates exactly the same problem elsewhere.

Fair enough.

> So, I hope with these diagrams, you can see that trying to make
> the ethernet controller a child device of the DSA switches
> means that (eg) it's no longer a PCI device, which is rather
> absurd, especially when considering that what happens to the
> right of the ethernet controller in the diagrams above is
> normally external chips to the SoC or ethernet device.

Ok, thanks for the long explainations and diagrams.

_BUT_ my original point remains.  These new functions you all are trying
to get into the driver core, do NOT do what they say they are doing.
They are mucking around with a "known topology" and just happen to work
because the device you are trying to find shares a common parent with
yourself.

That is not what the function says it does, and as such, I do not want
that function in the driver core at all.

If you wish to keep it in your own subsystem, that's fine, but call it
what it really is:
	hack_to_find_peer_device_on_random_bus()
and pass in a _real_ pointer to a bus type.  Not some random string
please.

Or better yet, have the DSA code accept pointers to the two devices in
the first place, so it "knows" what to do here in a much better way.
Right now it is a bad hack.  You all can not argue that is not true.

thanks,

greg k-h
Florian Fainelli Feb. 10, 2017, 6:30 p.m. UTC | #18
On 02/10/2017 05:02 AM, Greg KH wrote:
> On Thu, Jan 19, 2017 at 04:51:55PM +0000, Russell King - ARM Linux wrote:
>> (This is mainly for Greg's benefit to help him understand the issue.)
>>
>> I think the diagram you gave initially made this confusing, as it
>> talks about a CPU(sic) producing the "RGMII" and "MII-MGMT".
>>
>> Let's instead show a better representation that hopefully helps Greg
>> understand networking. :)
>>
>>
>>   CPU
>> System <-B->  Ethernet controller <-P-> } PHY <---> network cable
>>                                         } - - - - - - - or - - - - - - -
>>                   MDIO bus -------M---> } Switch <-P-> PHYs <--> network
>>                                               `----M----^        cables
>>
>> 'B' can be an on-SoC bus or something like PCI.
>>
>> 'P' are the high-speed connectivity between the ethernet controller and
>> PHY which carries the packet data.  It has no addressing, it's a point
>> to point link.  RGMII is just one wiring example, there are many
>> different interfaces there (SGMII, AUI, XAUI, XGMII to name a few.)
>>
>> 'M' are the MDIO bus, which is the bus by which ethernet PHYs and
>> switches can be identified and controlled.
>>
>> The MDIO bus has a bus_type, has host drivers which are sometimes
>> part of the ethernet controller, but can also be stand-alone devices
>> shared between multiple ethernet controllers.
>>
>> PHYs are a kind of MDIO device which are members of the MDIO bus
>> type.  Each PHY (and switch) has a numerical address, and identifying
>> numbers within its register set which identifies the manufacturer
>> and device type.  We have device_driver objects for these.
>>
>> Expanding the above diagram to make it (hopefully) even clearer,
>> we can have this classic setup:
>>
>>   CPU
>> System <-B-> Ethernet controller <-P-> PHY <---> network cable
>>                  MDIO bus -------M------^
>>
>> Or, in the case of two DSA switches attached to an Ethernet controller:
>>
>>                                      |~~~~~~~~|
>> System <-B-> Ethernet controller <-P-> Switch <-P-> PHY1 <--> network cable
>>                  MDIO bus ----+--M--->   1    <-P-> PHY2 <--> network cable
>>                               |      |        ...    |
>>                               |      |        <-P-> PHYn <--> network cable
>>                               |      |....^...|      |
>>                               |           |  `---M---'
>>                               |           P
>>                               |           |
>>                               |      |~~~~v~~~|
>>                               `------> Switch <-P-> PHY1 <--> network cable
>>                                      |   2    ...    |
>>                                      |        <-P-> PHYn <--> network cable
>>                                      |........|      |
>>                                              `---M---'
>>
>> The problem that the DSA guys are trying to deal with is how to
>> represent the link between the DSA switches (which are devices
>> sitting off their controlling bus - the MDIO bus) and the ethernet
>> controller associated with that collection of devices, be it a
>> switch or PHY.
> 
> Why do they have to represent that link?  This is a driver that somehow
> binds the two togther in some sort of "control plane"?

We have to represent that link because the CPU/host/management Ethernet
MAC is physically connected to the CPU/management port of the switch. It
does indeed participate in establishing the control plane.

The basic idea of DSA is that the switch inserts vendor tags to indicate
why the packet is sent towards the CPU in the first place: flooding,
management, copy etc along with information as to which
originating/destination port(s) this packet comes/goes from/to. On top
of that, we demultiplex that tag to deliver normal Ethernet frames to
per-port network devices (virtual network devices).

If we did leave the switch in an unmanaged mode and not logically
attached to an Ethernet MAC for management, we'd lose all that
information (we could use per-port VLANs to re-create it, but it would
be inferior to what a switch with proprietary tags can do)

Code in net/dsa/dsa2.c that binds the two (switch and Ethernet MAC)
together is not strictly a driver, it just is resident in memory and
waits for dsa_register_switch() to be called until it tries to do this
binding.

> 
>> Merely changing the parent/child relationships to try and solve
>> one issue just creates exactly the same problem elsewhere.
> 
> Fair enough.
> 
>> So, I hope with these diagrams, you can see that trying to make
>> the ethernet controller a child device of the DSA switches
>> means that (eg) it's no longer a PCI device, which is rather
>> absurd, especially when considering that what happens to the
>> right of the ethernet controller in the diagrams above is
>> normally external chips to the SoC or ethernet device.
> 
> Ok, thanks for the long explainations and diagrams.
> 
> _BUT_ my original point remains.  These new functions you all are trying
> to get into the driver core, do NOT do what they say they are doing.
> They are mucking around with a "known topology" and just happen to work
> because the device you are trying to find shares a common parent with
> yourself.
> 
> That is not what the function says it does, and as such, I do not want
> that function in the driver core at all.
> 
> If you wish to keep it in your own subsystem, that's fine, but call it
> what it really is:
> 	hack_to_find_peer_device_on_random_bus()
> and pass in a _real_ pointer to a bus type.  Not some random string
> please.

Yes, David has accepted that and we did make it prefixed with dsa_*.
That does not mean it should stay forever though if we can work on
something better designed.

> 
> Or better yet, have the DSA code accept pointers to the two devices in
> the first place, so it "knows" what to do here in a much better way.

Well, that is really what is being done here, the "inputs" to the code
in net/dsa/dsa2.c are opaque struct device references that we
verify/proof by making sure they belong in the class of device we expect
them to be.

NB: technically it's just one device reference: Ethernet MAC device,
because the other one is implied by the device driver registering the
switch with dsa_register_switch() (MDIO, SPI, I2C, PCI etc..).

> Right now it is a bad hack.  You all can not argue that is not true.

It's hard to argue on an qualifier for that design when we can't even
agree whether our understanding of these devices matches your
understanding after you read and digested our explanations :)

I entirely agree that the design has shortcomings, and in fact, it does
present some challenges that are not well solved right now, if you
unbind the Ethernet MAC driver, you leave a dangling DSA switch tree,
and good luck re-attaching it.

So maybe we should talk about it in person at $next conference, I will
be at ELC, how about you?

Thanks
Greg Kroah-Hartman Feb. 12, 2017, 12:56 p.m. UTC | #19
On Fri, Feb 10, 2017 at 10:30:58AM -0800, Florian Fainelli wrote:
> On 02/10/2017 05:02 AM, Greg KH wrote:
> > On Thu, Jan 19, 2017 at 04:51:55PM +0000, Russell King - ARM Linux wrote:
> >> (This is mainly for Greg's benefit to help him understand the issue.)
> >>
> >> I think the diagram you gave initially made this confusing, as it
> >> talks about a CPU(sic) producing the "RGMII" and "MII-MGMT".
> >>
> >> Let's instead show a better representation that hopefully helps Greg
> >> understand networking. :)
> >>
> >>
> >>   CPU
> >> System <-B->  Ethernet controller <-P-> } PHY <---> network cable
> >>                                         } - - - - - - - or - - - - - - -
> >>                   MDIO bus -------M---> } Switch <-P-> PHYs <--> network
> >>                                               `----M----^        cables
> >>
> >> 'B' can be an on-SoC bus or something like PCI.
> >>
> >> 'P' are the high-speed connectivity between the ethernet controller and
> >> PHY which carries the packet data.  It has no addressing, it's a point
> >> to point link.  RGMII is just one wiring example, there are many
> >> different interfaces there (SGMII, AUI, XAUI, XGMII to name a few.)
> >>
> >> 'M' are the MDIO bus, which is the bus by which ethernet PHYs and
> >> switches can be identified and controlled.
> >>
> >> The MDIO bus has a bus_type, has host drivers which are sometimes
> >> part of the ethernet controller, but can also be stand-alone devices
> >> shared between multiple ethernet controllers.
> >>
> >> PHYs are a kind of MDIO device which are members of the MDIO bus
> >> type.  Each PHY (and switch) has a numerical address, and identifying
> >> numbers within its register set which identifies the manufacturer
> >> and device type.  We have device_driver objects for these.
> >>
> >> Expanding the above diagram to make it (hopefully) even clearer,
> >> we can have this classic setup:
> >>
> >>   CPU
> >> System <-B-> Ethernet controller <-P-> PHY <---> network cable
> >>                  MDIO bus -------M------^
> >>
> >> Or, in the case of two DSA switches attached to an Ethernet controller:
> >>
> >>                                      |~~~~~~~~|
> >> System <-B-> Ethernet controller <-P-> Switch <-P-> PHY1 <--> network cable
> >>                  MDIO bus ----+--M--->   1    <-P-> PHY2 <--> network cable
> >>                               |      |        ...    |
> >>                               |      |        <-P-> PHYn <--> network cable
> >>                               |      |....^...|      |
> >>                               |           |  `---M---'
> >>                               |           P
> >>                               |           |
> >>                               |      |~~~~v~~~|
> >>                               `------> Switch <-P-> PHY1 <--> network cable
> >>                                      |   2    ...    |
> >>                                      |        <-P-> PHYn <--> network cable
> >>                                      |........|      |
> >>                                              `---M---'
> >>
> >> The problem that the DSA guys are trying to deal with is how to
> >> represent the link between the DSA switches (which are devices
> >> sitting off their controlling bus - the MDIO bus) and the ethernet
> >> controller associated with that collection of devices, be it a
> >> switch or PHY.
> > 
> > Why do they have to represent that link?  This is a driver that somehow
> > binds the two togther in some sort of "control plane"?
> 
> We have to represent that link because the CPU/host/management Ethernet
> MAC is physically connected to the CPU/management port of the switch. It
> does indeed participate in establishing the control plane.

But who uses that "link"?  What in userspace cares about it?  What in
the kernel cares?

> The basic idea of DSA is that the switch inserts vendor tags to indicate
> why the packet is sent towards the CPU in the first place: flooding,
> management, copy etc along with information as to which
> originating/destination port(s) this packet comes/goes from/to. On top
> of that, we demultiplex that tag to deliver normal Ethernet frames to
> per-port network devices (virtual network devices).
> 
> If we did leave the switch in an unmanaged mode and not logically
> attached to an Ethernet MAC for management, we'd lose all that
> information (we could use per-port VLANs to re-create it, but it would
> be inferior to what a switch with proprietary tags can do)
> 
> Code in net/dsa/dsa2.c that binds the two (switch and Ethernet MAC)
> together is not strictly a driver, it just is resident in memory and
> waits for dsa_register_switch() to be called until it tries to do this
> binding.

Ok, but when that "binding" happens, why do you need to show it in
sysfs?

> >> Merely changing the parent/child relationships to try and solve
> >> one issue just creates exactly the same problem elsewhere.
> > 
> > Fair enough.
> > 
> >> So, I hope with these diagrams, you can see that trying to make
> >> the ethernet controller a child device of the DSA switches
> >> means that (eg) it's no longer a PCI device, which is rather
> >> absurd, especially when considering that what happens to the
> >> right of the ethernet controller in the diagrams above is
> >> normally external chips to the SoC or ethernet device.
> > 
> > Ok, thanks for the long explainations and diagrams.
> > 
> > _BUT_ my original point remains.  These new functions you all are trying
> > to get into the driver core, do NOT do what they say they are doing.
> > They are mucking around with a "known topology" and just happen to work
> > because the device you are trying to find shares a common parent with
> > yourself.
> > 
> > That is not what the function says it does, and as such, I do not want
> > that function in the driver core at all.
> > 
> > If you wish to keep it in your own subsystem, that's fine, but call it
> > what it really is:
> > 	hack_to_find_peer_device_on_random_bus()
> > and pass in a _real_ pointer to a bus type.  Not some random string
> > please.
> 
> Yes, David has accepted that and we did make it prefixed with dsa_*.
> That does not mean it should stay forever though if we can work on
> something better designed.

Maybe, but really, this is a special case.

> > Or better yet, have the DSA code accept pointers to the two devices in
> > the first place, so it "knows" what to do here in a much better way.
> 
> Well, that is really what is being done here, the "inputs" to the code
> in net/dsa/dsa2.c are opaque struct device references that we
> verify/proof by making sure they belong in the class of device we expect
> them to be.

Why not take pointers to the class devices you expect in the first
place?  No need to take an "opaque" pointer, right?  Then nothing
special needs to be found anywhere else.

> NB: technically it's just one device reference: Ethernet MAC device,
> because the other one is implied by the device driver registering the
> switch with dsa_register_switch() (MDIO, SPI, I2C, PCI etc..).
> 
> > Right now it is a bad hack.  You all can not argue that is not true.
> 
> It's hard to argue on an qualifier for that design when we can't even
> agree whether our understanding of these devices matches your
> understanding after you read and digested our explanations :)

I think so, but again, you all don't seem to understand just why this is
such a bad hack of the driver model...

The main reason we don't have a "what type of device is this device
pointer" is that you are always supposed to "just know" what type it is,
because someone gave you the right type (i.e. the driver core, or
yourself).  So when you do random walks of the tree, and just expect
random strings to match up, it scares me.

> I entirely agree that the design has shortcomings, and in fact, it does
> present some challenges that are not well solved right now, if you
> unbind the Ethernet MAC driver, you leave a dangling DSA switch tree,
> and good luck re-attaching it.
> 
> So maybe we should talk about it in person at $next conference, I will
> be at ELC, how about you?

No, sorry, I'll be at a conference all the week before, and was at
FOSDEM, conferences 3 weeks in a row is rough.

greg k-h
diff mbox

Patch

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 2306d1b87c83..77fa4c4f5828 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -455,29 +455,11 @@  EXPORT_SYMBOL_GPL(dsa_switch_resume);
 #endif
 
 /* platform driver init and cleanup *****************************************/
-static int dev_is_class(struct device *dev, void *class)
-{
-	if (dev->class != NULL && !strcmp(dev->class->name, class))
-		return 1;
-
-	return 0;
-}
-
-static struct device *dev_find_class(struct device *parent, char *class)
-{
-	if (dev_is_class(parent, class)) {
-		get_device(parent);
-		return parent;
-	}
-
-	return device_find_child(parent, class, dev_is_class);
-}
-
 struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev)
 {
 	struct device *d;
 
-	d = dev_find_class(dev, "mdio_bus");
+	d = device_find_class(dev, "mdio_bus");
 	if (d != NULL) {
 		struct mii_bus *bus;
 
@@ -495,7 +477,7 @@  static struct net_device *dev_to_net_device(struct device *dev)
 {
 	struct device *d;
 
-	d = dev_find_class(dev, "net");
+	d = device_find_class(dev, "net");
 	if (d != NULL) {
 		struct net_device *nd;