mbox series

[0/2] net: stmmac: allow sharing MDIO lines

Message ID 20230807193102.6374-1-brgl@bgdev.pl (mailing list archive)
Headers show
Series net: stmmac: allow sharing MDIO lines | expand

Message

Bartosz Golaszewski Aug. 7, 2023, 7:31 p.m. UTC
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Two MACs may share MDIO lines to the PHYs. Let's allow that in the
stmmac driver by providing a new device-tree property allowing one MAC
node to reference the MDIO bus defined on a second MAC node.

Bartosz Golaszewski (2):
  dt-bindings: net: snps,dwmac: document the snps,shared-mdio property
  net: stmmac: support shared MDIO

 Documentation/devicetree/bindings/net/snps,dwmac.yaml | 6 ++++++
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c     | 8 ++++++++
 drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 6 ++++++
 include/linux/stmmac.h                                | 1 +
 4 files changed, 21 insertions(+)

Comments

Andrew Lunn Aug. 7, 2023, 7:50 p.m. UTC | #1
On Mon, Aug 07, 2023 at 09:31:00PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Two MACs may share MDIO lines to the PHYs. Let's allow that in the
> stmmac driver by providing a new device-tree property allowing one MAC
> node to reference the MDIO bus defined on a second MAC node.

I don't understand why this is needed. phy-handle can point to a phy
on any MDIO bus. So it is no problem for one MAC to point to the other
MACs MDIO bus as is.

You do sometimes get into ordering problems, especially if MAC0 is
pointing to a PHY on MAC1 MDIO bus. But MAC0 should get a
-EPROBE_DEFER, MAC1 then probes, creating its MDIO bus and the two
PHYs on it, and then later MAC0 is probes again and is successful.

     Andrew
Bartosz Golaszewski Aug. 8, 2023, 8:13 a.m. UTC | #2
On Mon, Aug 7, 2023 at 9:50 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Mon, Aug 07, 2023 at 09:31:00PM +0200, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > Two MACs may share MDIO lines to the PHYs. Let's allow that in the
> > stmmac driver by providing a new device-tree property allowing one MAC
> > node to reference the MDIO bus defined on a second MAC node.
>
> I don't understand why this is needed. phy-handle can point to a phy
> on any MDIO bus. So it is no problem for one MAC to point to the other
> MACs MDIO bus as is.
>
> You do sometimes get into ordering problems, especially if MAC0 is
> pointing to a PHY on MAC1 MDIO bus. But MAC0 should get a
> -EPROBE_DEFER, MAC1 then probes, creating its MDIO bus and the two
> PHYs on it, and then later MAC0 is probes again and is successful.
>
>      Andrew

Ok so upon some further investigation, the actual culprit is in stmmac
platform code - it always tries to register an MDIO bus - independent
of whether there is an actual mdio child node - unless the MAC is
marked explicitly as having a fixed-link.

When I fixed that, MAC1's probe is correctly deferred until MAC0 has
created the MDIO bus.

Even so, isn't it useful to actually reference the shared MDIO bus in some way?

If the schematics look something like this:

--------           -------
| MAC0 |--MDIO-----| PHY |
-------- |     |   -------
         |     |
-------- |     |   -------
| MAC1 |--     ----| PHY |
--------           -------

Then it would make sense to model it on the device tree?

Anyway, this can be discussed later, I will drop this for now and send
a fix for stmmac mdio code instead to get this upstream.

Bart
Andrew Lunn Aug. 8, 2023, 1:09 p.m. UTC | #3
> Ok so upon some further investigation, the actual culprit is in stmmac
> platform code - it always tries to register an MDIO bus - independent
> of whether there is an actual mdio child node - unless the MAC is
> marked explicitly as having a fixed-link.

If the MDIO bus does exist registering it should not be a problem.
Ideally it should have an internal pull up on its MDIO line, so that
all reads return 0xffff. The phylib core will then determine there are
no devices on it. Not actually registering it because there is no MDIO
node in DT is just an optimisation.

> When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> created the MDIO bus.

Great. That is how it should work.

> Even so, isn't it useful to actually reference the shared MDIO bus in some way?

Why? Linux does not care where the PHY is. There are some SoCs with an
independent MDIO bus masters. They have there own node in DT, there
own driver etc. You can create an MDIO bus from two or three GPIOs and
bit banging, again as an node in DT.  There are Ethernet switches
which can have 11 ports, and 2 MDIO busses, one purely internal and
one external, and the PHYs are scattered over these busses.

All linux needs is a phandle to the PHY, nothing more.

	Andrew
Russell King (Oracle) Aug. 8, 2023, 1:26 p.m. UTC | #4
On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> Ok so upon some further investigation, the actual culprit is in stmmac
> platform code - it always tries to register an MDIO bus - independent
> of whether there is an actual mdio child node - unless the MAC is
> marked explicitly as having a fixed-link.
> 
> When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> created the MDIO bus.
> 
> Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> 
> If the schematics look something like this:
> 
> --------           -------
> | MAC0 |--MDIO-----| PHY |
> -------- |     |   -------
>          |     |
> -------- |     |   -------
> | MAC1 |--     ----| PHY |
> --------           -------
> 
> Then it would make sense to model it on the device tree?

So I think what you're saying is that MAC0 and MAC1's have MDIO bus
masters, and the hardware designer decided to tie both together to
a single set of clock and data lines, which then go to two PHYs.

In that case, I would strongly advise only registering one MDIO bus,
and avoid registering the second one - thereby preventing any issues
caused by both MDIO bus masters trying to talk at the same time.

The PHYs should be populated in firmware on just one of the buses.

You will also need to ensure that whatever registers the bus does
make sure that the clocks necessary for communicating on the bus
are under control of the MDIO bus code and not the ethernet MAC
code. We've run into problems in the past where this has not been
the case, and it means - taking your example above - that when MAC1
wants to talk to its PHY, if MAC0 isn't alive it can't.

So just be aware of the clocking situation and make sure that your
MDIO bus code is managing the clocks necessary for the MDIO bus
master to work.

In regard to sharing of the MDIO bus signals between two bus
masters, I do not believe that is permissible - there's no
collision detection in hardware like there is on I²C. So
having two MDIO bus masters talking at the same time would
end up corrupting the MDC (clock) and MDIO (data) signals if
both were active at the same time.
Bartosz Golaszewski Aug. 8, 2023, 2:09 p.m. UTC | #5
On Tue, Aug 8, 2023 at 3:26 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > Ok so upon some further investigation, the actual culprit is in stmmac
> > platform code - it always tries to register an MDIO bus - independent
> > of whether there is an actual mdio child node - unless the MAC is
> > marked explicitly as having a fixed-link.
> >
> > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > created the MDIO bus.
> >
> > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> >
> > If the schematics look something like this:
> >
> > --------           -------
> > | MAC0 |--MDIO-----| PHY |
> > -------- |     |   -------
> >          |     |
> > -------- |     |   -------
> > | MAC1 |--     ----| PHY |
> > --------           -------
> >
> > Then it would make sense to model it on the device tree?
>
> So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> masters, and the hardware designer decided to tie both together to
> a single set of clock and data lines, which then go to two PHYs.

The schematics I have are not very clear on that, but now that you
mention this, it's most likely the case.

>
> In that case, I would strongly advise only registering one MDIO bus,
> and avoid registering the second one - thereby preventing any issues
> caused by both MDIO bus masters trying to talk at the same time.
>

I sent a patch for that earlier today.

> The PHYs should be populated in firmware on just one of the buses.
>
> You will also need to ensure that whatever registers the bus does
> make sure that the clocks necessary for communicating on the bus
> are under control of the MDIO bus code and not the ethernet MAC
> code. We've run into problems in the past where this has not been
> the case, and it means - taking your example above - that when MAC1
> wants to talk to its PHY, if MAC0 isn't alive it can't.

Good point, but it's worse than that: when MAC0 is unbound, it will
unregister the MDIO bus and destroy all PHY devices. These are not
refcounted so they will literally go from under MAC1. Not sure how
this can be dealt with?

>
> So just be aware of the clocking situation and make sure that your
> MDIO bus code is managing the clocks necessary for the MDIO bus
> master to work.

Doesn't seem like stmmac is ready for it as it is now so this is going
to be fun...

Bartosz

>
> In regard to sharing of the MDIO bus signals between two bus
> masters, I do not believe that is permissible - there's no
> collision detection in hardware like there is on I涎. So
> having two MDIO bus masters talking at the same time would
> end up corrupting the MDC (clock) and MDIO (data) signals if
> both were active at the same time.
>
Andrew Lunn Aug. 8, 2023, 2:25 p.m. UTC | #6
> > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > platform code - it always tries to register an MDIO bus - independent
> > > of whether there is an actual mdio child node - unless the MAC is
> > > marked explicitly as having a fixed-link.
> > >
> > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > created the MDIO bus.
> > >
> > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > >
> > > If the schematics look something like this:
> > >
> > > --------           -------
> > > | MAC0 |--MDIO-----| PHY |
> > > -------- |     |   -------
> > >          |     |
> > > -------- |     |   -------
> > > | MAC1 |--     ----| PHY |
> > > --------           -------
> > >
> > > Then it would make sense to model it on the device tree?
> >
> > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > masters, and the hardware designer decided to tie both together to
> > a single set of clock and data lines, which then go to two PHYs.
> 
> The schematics I have are not very clear on that, but now that you
> mention this, it's most likely the case.

I hope not. That would be very broken. As Russell pointed out, MDIO is
not multi-master. You need to check with the hardware designer if the
schematics are not clear.

> Good point, but it's worse than that: when MAC0 is unbound, it will
> unregister the MDIO bus and destroy all PHY devices. These are not
> refcounted so they will literally go from under MAC1. Not sure how
> this can be dealt with?

unbinding is not a normal operation. So i would just live with it, and
if root decides to shoot herself in the foot, that is her choice.

   Andrew
Bartosz Golaszewski Aug. 8, 2023, 2:30 p.m. UTC | #7
On Tue, Aug 8, 2023 at 4:25 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > > platform code - it always tries to register an MDIO bus - independent
> > > > of whether there is an actual mdio child node - unless the MAC is
> > > > marked explicitly as having a fixed-link.
> > > >
> > > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > > created the MDIO bus.
> > > >
> > > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > > >
> > > > If the schematics look something like this:
> > > >
> > > > --------           -------
> > > > | MAC0 |--MDIO-----| PHY |
> > > > -------- |     |   -------
> > > >          |     |
> > > > -------- |     |   -------
> > > > | MAC1 |--     ----| PHY |
> > > > --------           -------
> > > >
> > > > Then it would make sense to model it on the device tree?
> > >
> > > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > > masters, and the hardware designer decided to tie both together to
> > > a single set of clock and data lines, which then go to two PHYs.
> >
> > The schematics I have are not very clear on that, but now that you
> > mention this, it's most likely the case.
>
> I hope not. That would be very broken. As Russell pointed out, MDIO is
> not multi-master. You need to check with the hardware designer if the
> schematics are not clear.

Sorry, it was not very clear. It's the case that two MDIO masters
share the MDC and data lines.

>
> > Good point, but it's worse than that: when MAC0 is unbound, it will
> > unregister the MDIO bus and destroy all PHY devices. These are not
> > refcounted so they will literally go from under MAC1. Not sure how
> > this can be dealt with?
>
> unbinding is not a normal operation. So i would just live with it, and
> if root decides to shoot herself in the foot, that is her choice.
>

I disagree. Unbinding is very much a normal operation. Less so for
platform devices but still, it is there for a reason and should be
expected to work correctly. Or at the very least not crash and burn
the system.

On the other hand, I like your approach because I may get away without
having to fix it. But if I were to fix it - I would reference the MDIO
bus from the secondary mac by phandle and count its references before
dropping it. :)

Bartosz
Russell King (Oracle) Aug. 8, 2023, 2:30 p.m. UTC | #8
On Tue, Aug 08, 2023 at 04:09:11PM +0200, Bartosz Golaszewski wrote:
> On Tue, Aug 8, 2023 at 3:26 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > platform code - it always tries to register an MDIO bus - independent
> > > of whether there is an actual mdio child node - unless the MAC is
> > > marked explicitly as having a fixed-link.
> > >
> > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > created the MDIO bus.
> > >
> > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > >
> > > If the schematics look something like this:
> > >
> > > --------           -------
> > > | MAC0 |--MDIO-----| PHY |
> > > -------- |     |   -------
> > >          |     |
> > > -------- |     |   -------
> > > | MAC1 |--     ----| PHY |
> > > --------           -------
> > >
> > > Then it would make sense to model it on the device tree?
> >
> > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > masters, and the hardware designer decided to tie both together to
> > a single set of clock and data lines, which then go to two PHYs.
> 
> The schematics I have are not very clear on that, but now that you
> mention this, it's most likely the case.
> 
> >
> > In that case, I would strongly advise only registering one MDIO bus,
> > and avoid registering the second one - thereby preventing any issues
> > caused by both MDIO bus masters trying to talk at the same time.
> >
> 
> I sent a patch for that earlier today.
> 
> > The PHYs should be populated in firmware on just one of the buses.
> >
> > You will also need to ensure that whatever registers the bus does
> > make sure that the clocks necessary for communicating on the bus
> > are under control of the MDIO bus code and not the ethernet MAC
> > code. We've run into problems in the past where this has not been
> > the case, and it means - taking your example above - that when MAC1
> > wants to talk to its PHY, if MAC0 isn't alive it can't.
> 
> Good point, but it's worse than that: when MAC0 is unbound, it will
> unregister the MDIO bus and destroy all PHY devices. These are not
> refcounted so they will literally go from under MAC1. Not sure how
> this can be dealt with?

That has been a problem in the past, where a MII bus has been
registered by a driver, and then because its probe defers, the MII
bus gets torn down.

The "simple" solution to this is... try to avoid registering the MII
bus until you're sure that the probing will not defer. It is far from
perfect, since there's still the opportunity to unbind the driver
causing the MII bus to vanish along with the PHYs.

I have mentioned trying to address the issue of PHY drivers being
unbound in the past, and there's been some improvements with that,
but if the phy_device vanishes while something is using it, it
certainly will not end well. phylib is not the only case of this,
there are numerous instances of it. One of the recent ones that
I happened to be reminded of today is the pcs-rzn1-miic thing...
If you have a look at miic_create() and consider what would happen
if:

        if (!pdev || !platform_get_drvdata(pdev))
                return ERR_PTR(-EPROBE_DEFER);

 ... another thread ended up executing miic_remove() for this
    platform device at this very point ...

        miic_port = kzalloc(sizeof(*miic_port), GFP_KERNEL);
        if (!miic_port)
                return ERR_PTR(-ENOMEM);

        miic = platform_get_drvdata(pdev);
        device_link_add(dev, miic->dev, DL_FLAG_AUTOREMOVE_CONSUMER);

The devm allocation for "miic" would be freed, so either miic
ends up a stale pointer if it happened after this point, or
if miic_remove() completes, then platform_get_drvdata() returns
NULL and we oops the kernel here.

It's an unlikely race, but it's still a race. Sadly, the kernel
is getting riddled with things like this. I used to point these
things out, but having been shouted down many times I've given
up raising it.

Another example is the direct rendering manager bridge code
(drm_bridge).

I suggest a similar approach to not caring too much about this
for your own sanity... providing it doesn't actually cause a
problem!
Andrew Halaney Aug. 8, 2023, 2:44 p.m. UTC | #9
On Tue, Aug 08, 2023 at 04:30:05PM +0200, Bartosz Golaszewski wrote:
> On Tue, Aug 8, 2023 at 4:25 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > > > platform code - it always tries to register an MDIO bus - independent
> > > > > of whether there is an actual mdio child node - unless the MAC is
> > > > > marked explicitly as having a fixed-link.
> > > > >
> > > > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > > > created the MDIO bus.
> > > > >
> > > > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > > > >
> > > > > If the schematics look something like this:
> > > > >
> > > > > --------           -------
> > > > > | MAC0 |--MDIO-----| PHY |
> > > > > -------- |     |   -------
> > > > >          |     |
> > > > > -------- |     |   -------
> > > > > | MAC1 |--     ----| PHY |
> > > > > --------           -------
> > > > >
> > > > > Then it would make sense to model it on the device tree?
> > > >
> > > > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > > > masters, and the hardware designer decided to tie both together to
> > > > a single set of clock and data lines, which then go to two PHYs.
> > >
> > > The schematics I have are not very clear on that, but now that you
> > > mention this, it's most likely the case.
> >
> > I hope not. That would be very broken. As Russell pointed out, MDIO is
> > not multi-master. You need to check with the hardware designer if the
> > schematics are not clear.
> 
> Sorry, it was not very clear. It's the case that two MDIO masters
> share the MDC and data lines.

I'll make the water muddier (hopefully clearer?). I have access to the
board schematic (not SIP/SOM stuff though), but that should help here.

MAC0 owns its own MDIO bus (we'll call it MDIO0). It is pinmuxed to
gpio8/gpio9 for mdc/mdio. MAC1 owns its own bus (MDIO1) which is
pinmuxed to gpio21/22.

On MDIO0 there are two SGMII ethernet phys. One is connected to MAC0,
one is connected to MAC1.

MDIO1 is not connected to anything on the board. So there is only one
MDIO master, MAC0 on MDIO0, and it manages the ethernet phy for both
MAC0/MAC1.

Does that make sense? I don't think from a hardware design standpoint
this is violating anything, it isn't a multimaster setup on MDIO.

> 
> >
> > > Good point, but it's worse than that: when MAC0 is unbound, it will
> > > unregister the MDIO bus and destroy all PHY devices. These are not
> > > refcounted so they will literally go from under MAC1. Not sure how
> > > this can be dealt with?
> >
> > unbinding is not a normal operation. So i would just live with it, and
> > if root decides to shoot herself in the foot, that is her choice.
> >
> 
> I disagree. Unbinding is very much a normal operation. Less so for
> platform devices but still, it is there for a reason and should be
> expected to work correctly. Or at the very least not crash and burn
> the system.
> 
> On the other hand, I like your approach because I may get away without
> having to fix it. But if I were to fix it - I would reference the MDIO
> bus from the secondary mac by phandle and count its references before
> dropping it. :)
> 
> Bartosz
>
Russell King (Oracle) Aug. 8, 2023, 3:10 p.m. UTC | #10
On Tue, Aug 08, 2023 at 09:44:16AM -0500, Andrew Halaney wrote:
> On Tue, Aug 08, 2023 at 04:30:05PM +0200, Bartosz Golaszewski wrote:
> > On Tue, Aug 8, 2023 at 4:25 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > > > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > > > > platform code - it always tries to register an MDIO bus - independent
> > > > > > of whether there is an actual mdio child node - unless the MAC is
> > > > > > marked explicitly as having a fixed-link.
> > > > > >
> > > > > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > > > > created the MDIO bus.
> > > > > >
> > > > > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > > > > >
> > > > > > If the schematics look something like this:
> > > > > >
> > > > > > --------           -------
> > > > > > | MAC0 |--MDIO-----| PHY |
> > > > > > -------- |     |   -------
> > > > > >          |     |
> > > > > > -------- |     |   -------
> > > > > > | MAC1 |--     ----| PHY |
> > > > > > --------           -------
> > > > > >
> > > > > > Then it would make sense to model it on the device tree?
> > > > >
> > > > > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > > > > masters, and the hardware designer decided to tie both together to
> > > > > a single set of clock and data lines, which then go to two PHYs.
> > > >
> > > > The schematics I have are not very clear on that, but now that you
> > > > mention this, it's most likely the case.
> > >
> > > I hope not. That would be very broken. As Russell pointed out, MDIO is
> > > not multi-master. You need to check with the hardware designer if the
> > > schematics are not clear.
> > 
> > Sorry, it was not very clear. It's the case that two MDIO masters
> > share the MDC and data lines.
> 
> I'll make the water muddier (hopefully clearer?). I have access to the
> board schematic (not SIP/SOM stuff though), but that should help here.
> 
> MAC0 owns its own MDIO bus (we'll call it MDIO0). It is pinmuxed to
> gpio8/gpio9 for mdc/mdio. MAC1 owns its own bus (MDIO1) which is
> pinmuxed to gpio21/22.
> 
> On MDIO0 there are two SGMII ethernet phys. One is connected to MAC0,
> one is connected to MAC1.
> 
> MDIO1 is not connected to anything on the board. So there is only one
> MDIO master, MAC0 on MDIO0, and it manages the ethernet phy for both
> MAC0/MAC1.
> 
> Does that make sense? I don't think from a hardware design standpoint
> this is violating anything, it isn't a multimaster setup on MDIO.

That all sounds sane, thanks.
Andrew Lunn Aug. 8, 2023, 3:15 p.m. UTC | #11
> I'll make the water muddier (hopefully clearer?). I have access to the
> board schematic (not SIP/SOM stuff though), but that should help here.
> 
> MAC0 owns its own MDIO bus (we'll call it MDIO0). It is pinmuxed to
> gpio8/gpio9 for mdc/mdio. MAC1 owns its own bus (MDIO1) which is
> pinmuxed to gpio21/22.
> 
> On MDIO0 there are two SGMII ethernet phys. One is connected to MAC0,
> one is connected to MAC1.
> 
> MDIO1 is not connected to anything on the board. So there is only one
> MDIO master, MAC0 on MDIO0, and it manages the ethernet phy for both
> MAC0/MAC1.
> 
> Does that make sense? I don't think from a hardware design standpoint
> this is violating anything, it isn't a multimaster setup on MDIO.

Thanks for taking a detailed look at the schematics. This is how i
would expect it to be.

> > > > Good point, but it's worse than that: when MAC0 is unbound, it will
> > > > unregister the MDIO bus and destroy all PHY devices. These are not
> > > > refcounted so they will literally go from under MAC1. Not sure how
> > > > this can be dealt with?
> > >
> > > unbinding is not a normal operation. So i would just live with it, and
> > > if root decides to shoot herself in the foot, that is her choice.
> > >
> > 
> > I disagree. Unbinding is very much a normal operation.

What do you use it for?

I don't think i've ever manually done it. Maybe as part of a script to
unbind the FTDI driver from an FTDI device in order to use user space
tools to program the EEPROM? But that is about it.

I actually expect many unbind operations are broken because it is very
rarely used.

       Andrew
Russell King (Oracle) Aug. 8, 2023, 3:27 p.m. UTC | #12
On Tue, Aug 08, 2023 at 05:15:45PM +0200, Andrew Lunn wrote:
> > > > > Good point, but it's worse than that: when MAC0 is unbound, it will
> > > > > unregister the MDIO bus and destroy all PHY devices. These are not
> > > > > refcounted so they will literally go from under MAC1. Not sure how
> > > > > this can be dealt with?
> > > >
> > > > unbinding is not a normal operation. So i would just live with it, and
> > > > if root decides to shoot herself in the foot, that is her choice.
> > > >
> > > 
> > > I disagree. Unbinding is very much a normal operation.
> 
> What do you use it for?
> 
> I don't think i've ever manually done it. Maybe as part of a script to
> unbind the FTDI driver from an FTDI device in order to use user space
> tools to program the EEPROM? But that is about it.
> 
> I actually expect many unbind operations are broken because it is very
> rarely used.

rmmod! Particularly useful during driver development, I tend to use it
extensively - and it has the advantage of testing those unbind paths!
Bartosz Golaszewski Aug. 8, 2023, 6:26 p.m. UTC | #13
On Tue, Aug 8, 2023 at 5:15 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > I'll make the water muddier (hopefully clearer?). I have access to the
> > board schematic (not SIP/SOM stuff though), but that should help here.
> >
> > MAC0 owns its own MDIO bus (we'll call it MDIO0). It is pinmuxed to
> > gpio8/gpio9 for mdc/mdio. MAC1 owns its own bus (MDIO1) which is
> > pinmuxed to gpio21/22.
> >
> > On MDIO0 there are two SGMII ethernet phys. One is connected to MAC0,
> > one is connected to MAC1.
> >
> > MDIO1 is not connected to anything on the board. So there is only one
> > MDIO master, MAC0 on MDIO0, and it manages the ethernet phy for both
> > MAC0/MAC1.
> >
> > Does that make sense? I don't think from a hardware design standpoint
> > this is violating anything, it isn't a multimaster setup on MDIO.
>
> Thanks for taking a detailed look at the schematics. This is how i
> would expect it to be.
>
> > > > > Good point, but it's worse than that: when MAC0 is unbound, it will
> > > > > unregister the MDIO bus and destroy all PHY devices. These are not
> > > > > refcounted so they will literally go from under MAC1. Not sure how
> > > > > this can be dealt with?
> > > >
> > > > unbinding is not a normal operation. So i would just live with it, and
> > > > if root decides to shoot herself in the foot, that is her choice.
> > > >
> > >
> > > I disagree. Unbinding is very much a normal operation.
>
> What do you use it for?
>
> I don't think i've ever manually done it. Maybe as part of a script to
> unbind the FTDI driver from an FTDI device in order to use user space
> tools to program the EEPROM? But that is about it.
>
> I actually expect many unbind operations are broken because it is very
> rarely used.
>

When I say "device unbind", I don't just mean manual unbinding using
sysfs. I mean any code path (rmmod, unplugging the USB, etc.) that
leads to the device being detached from its driver. This is a
perfectly normal situation and should work correctly.

I won't be fixing it for this series but may end up looking into
establishing some kind of device links between MACs and their "remote"
PHYs that would allow to safely unbind them.

Bart
Russell King (Oracle) Aug. 8, 2023, 6:38 p.m. UTC | #14
On Tue, Aug 08, 2023 at 08:26:22PM +0200, Bartosz Golaszewski wrote:
> When I say "device unbind", I don't just mean manual unbinding using
> sysfs. I mean any code path (rmmod, unplugging the USB, etc.) that
> leads to the device being detached from its driver. This is a
> perfectly normal situation and should work correctly.
> 
> I won't be fixing it for this series but may end up looking into
> establishing some kind of device links between MACs and their "remote"
> PHYs that would allow to safely unbind them.

I don't think you're the first to suggest that!

That gets difficult - because although the PHY may be a different
driver, the MDIO bus may be provided by the _same_ hardware as the
ethernet MAC itself. So you end up with a circular dependency - the
PHY device depends on the MDIO bus device (which is the ethernet MAC)
and then you make the ethernet MAC depend on the PHY device.