[RFC] net: phy: Fix reboot crash if CONFIG_IP_PNP is not set

Message ID 20210104122415.1263541-1-geert+renesas@glider.be (mailing list archive)
State Under Review
Delegated to: Geert Uytterhoeven

Commit Message

Geert Uytterhoeven Jan. 4, 2021, 12:24 p.m. UTC
Wolfram reports that his R-Car H2-based Lager board can no longer be
rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
The issue can be reproduced on other boards (e.g. Koelsch with R-Car
M2-W) too, if CONFIG_IP_PNP is disabled:

    Unhandled fault: imprecise external abort (0x1406) at 0x00000000
    pgd = (ptrval)
    [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
    Internal error: : 1406 [#1] ARM
    Modules linked in:
    CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf7751 #1048
    Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
    PC is at sh_mdio_ctrl+0x44/0x60
    LR is at sh_mmd_ctrl+0x20/0x24
    ...
    Backtrace:
    [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
     r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
    [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
    [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
     r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
    [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
     r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
    [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
     r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
    [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
     r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
    [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
    [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
     r5:c229f800 r4:c229f800
    [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
     r5:c229f800 r4:c229f804
    [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
    [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
     r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
    [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
    [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
     r5:4321fedc r4:01234567
    [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
     r7:00000058 r6:00000000 r5:00000000 r4:00000000
    [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)

Calling phy_disable_interrupts() unconditionally means that the PHY
registers may be accessed while the device is suspended, causing
undefined behavior, which may crash the system.

Fix this by calling phy_disable_interrupts() only when the PHY has been
started.

Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Fixes: e2f016cf775129c0 ("net: phy: add a shutdown procedure")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Marked RFC as I do not know if this change breaks the use case fixed by
the faulty commit.  Alternatively, the device may have to be started
explicitly first.
---
 drivers/net/phy/phy_device.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Ioana Ciornei Jan. 4, 2021, 2:53 p.m. UTC | #1
Hi Geert,

On Mon, Jan 04, 2021 at 01:24:15PM +0100, Geert Uytterhoeven wrote:
> Wolfram reports that his R-Car H2-based Lager board can no longer be
> rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
> The issue can be reproduced on other boards (e.g. Koelsch with R-Car
> M2-W) too, if CONFIG_IP_PNP is disabled:

What kind of PHYs are used on these boards?

> 
>     Unhandled fault: imprecise external abort (0x1406) at 0x00000000
>     pgd = (ptrval)
>     [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
>     Internal error: : 1406 [#1] ARM
>     Modules linked in:
>     CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf7751 #1048
>     Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
>     PC is at sh_mdio_ctrl+0x44/0x60
>     LR is at sh_mmd_ctrl+0x20/0x24
>     ...
>     Backtrace:
>     [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
>      r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
>     [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
>     [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
>      r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
>     [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
>      r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
>     [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
>      r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
>     [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
>      r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
>     [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
>     [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
>      r5:c229f800 r4:c229f800
>     [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
>      r5:c229f800 r4:c229f804
>     [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
>     [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
>      r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
>     [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
>     [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
>      r5:4321fedc r4:01234567
>     [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
>      r7:00000058 r6:00000000 r5:00000000 r4:00000000
>     [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)
> 
> Calling phy_disable_interrupts() unconditionally means that the PHY
> registers may be accessed while the device is suspended, causing
> undefined behavior, which may crash the system.
> 
> Fix this by calling phy_disable_interrupts() only when the PHY has been
> started.
> 
> Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> Fixes: e2f016cf775129c0 ("net: phy: add a shutdown procedure")
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Marked RFC as I do not know if this change breaks the use case fixed by
> the faulty commit.

I haven't tested it yet but most probably this change would partially
revert the behavior to how things were before adding the shutdown
procedure.

And this is because the interrupts are enabled at phy_connect and not at
phy_start so we would want to disable any PHY interrupts even though the
PHY has not been started yet.
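
A minimal user-space sketch of the point above, under the assumption stated in this mail that interrupts are armed at connect time rather than at start time. All `model_*` names are hypothetical stand-ins, not kernel APIs; the sketch only tracks state flags and touches no hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of PHY state; none of these names exist in the kernel. */
struct model_phy {
	bool connected;   /* set by the model's connect step */
	bool started;     /* set by the model's start step */
	bool irq_enabled; /* PHY interrupt source armed */
};

/* Assumption from the mail: interrupts are armed when the PHY is connected. */
static void model_connect(struct model_phy *phy)
{
	phy->connected = true;
	phy->irq_enabled = true;
}

static void model_start(struct model_phy *phy)
{
	assert(phy->connected); /* a PHY must be connected before it is started */
	phy->started = true;
}

/* Shutdown as proposed in the RFC: only act if the PHY was started. */
static void model_shutdown_if_started(struct model_phy *phy)
{
	if (phy->started)
		phy->irq_enabled = false;
}

/* Shutdown as in the commit being fixed: always disable interrupts. */
static void model_shutdown_always(struct model_phy *phy)
{
	phy->irq_enabled = false;
}
```

In this model, a PHY that was connected but never started keeps its interrupt armed after `model_shutdown_if_started()`, which is the partial revert described above.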

> Alternatively, the device may have to be started
> explicitly first.

Have you actually tried this out and it worked?

I am asking this because I would much rather expect this to be a problem
with how the sh_eth driver behaves if the netdevice did not connect to
the PHY (this is done in .open() alongside the phy_start()) and it
suddenly has to interact with it through the mdiobb_ops callbacks.

Also, I just re-tested this use case in which I do not start the
interface and just issue a reboot, and it behaves as expected.

> ---
>  drivers/net/phy/phy_device.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 80c2e646c0934311..5985061b00128f8a 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -2962,7 +2962,8 @@ static void phy_shutdown(struct device *dev)
>  {
>  	struct phy_device *phydev = to_phy_device(dev);
>  
> -	phy_disable_interrupts(phydev);
> +	if (phy_is_started(phydev))
> +		phy_disable_interrupts(phydev);
>  }
>  
>  /**
> -- 
> 2.25.1
>
Geert Uytterhoeven Jan. 4, 2021, 3:11 p.m. UTC | #2
Hi Ioana,

On Mon, Jan 4, 2021 at 3:53 PM Ioana Ciornei <ioana.ciornei@nxp.com> wrote:
> On Mon, Jan 04, 2021 at 01:24:15PM +0100, Geert Uytterhoeven wrote:
> > Wolfram reports that his R-Car H2-based Lager board can no longer be
> > rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
> > The issue can be reproduced on other boards (e.g. Koelsch with R-Car
> > M2-W) too, if CONFIG_IP_PNP is disabled:
>
> What kind of PHYs are used on these boards?

Micrel KSZ8041RNLI

> >     Unhandled fault: imprecise external abort (0x1406) at 0x00000000
> >     pgd = (ptrval)
> >     [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
> >     Internal error: : 1406 [#1] ARM
> >     Modules linked in:
> >     CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf7751 #1048
> >     Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
> >     PC is at sh_mdio_ctrl+0x44/0x60
> >     LR is at sh_mmd_ctrl+0x20/0x24
> >     ...
> >     Backtrace:
> >     [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
> >      r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
> >     [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
> >     [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
> >      r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
> >     [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
> >      r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
> >     [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
> >      r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
> >     [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
> >      r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
> >     [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
> >     [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
> >      r5:c229f800 r4:c229f800
> >     [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
> >      r5:c229f800 r4:c229f804
> >     [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
> >     [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
> >      r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
> >     [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
> >     [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
> >      r5:4321fedc r4:01234567
> >     [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
> >      r7:00000058 r6:00000000 r5:00000000 r4:00000000
> >     [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)
> >
> > Calling phy_disable_interrupts() unconditionally means that the PHY
> > registers may be accessed while the device is suspended, causing
> > undefined behavior, which may crash the system.
> >
> > Fix this by calling phy_disable_interrupts() only when the PHY has been
> > started.
> >
> > Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> > Fixes: e2f016cf775129c0 ("net: phy: add a shutdown procedure")
> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > ---
> > Marked RFC as I do not know if this change breaks the use case fixed by
> > the faulty commit.
>
> I haven't tested it yet but most probably this change would partially
> revert the behavior to how things were before adding the shutdown
> procedure.
>
> And this is because the interrupts are enabled at phy_connect and not at
> phy_start so we would want to disable any PHY interrupts even though the
> PHY has not been started yet.

Makes sense.

> > Alternatively, the device may have to be started
> > explicitly first.
>
> Have you actually tried this out and it worked?

No, I haven't tested restarting the device first.
I would like to avoid starting the device during shutdown, unless it is
absolutely necessary.

> I am asking this because I would much rather expect this to be a problem
> with how the sh_eth driver behaves if the netdevice did not connect to
> the PHY (this is done in .open() alongside the phy_start()) and it
> suddenly has to interact with it through the mdiobb_ops callbacks.
>
> Also, I just re-tested this use case in which I do not start the
> interface and just issue a reboot, and it behaves as expected.

It depends on the hardware: the sh_eth device is powered down when its
module clock is stopped. When powered down, any access to the sh_eth
registers or to the PHY connected to it will cause a crash.

On most other hardware, you can access the PHY regardless, and no crash
will happen.

> > --- a/drivers/net/phy/phy_device.c
> > +++ b/drivers/net/phy/phy_device.c
> > @@ -2962,7 +2962,8 @@ static void phy_shutdown(struct device *dev)
> >  {
> >       struct phy_device *phydev = to_phy_device(dev);
> >
> > -     phy_disable_interrupts(phydev);
> > +     if (phy_is_started(phydev))
> > +             phy_disable_interrupts(phydev);
> >  }
> >
> >  /**

Gr{oetje,eeting}s,

                        Geert
Ioana Ciornei Jan. 4, 2021, 5:01 p.m. UTC | #3
On Mon, Jan 04, 2021 at 04:11:02PM +0100, Geert Uytterhoeven wrote:
> Hi Ioana,
> 
> On Mon, Jan 4, 2021 at 3:53 PM Ioana Ciornei <ioana.ciornei@nxp.com> wrote:
> > On Mon, Jan 04, 2021 at 01:24:15PM +0100, Geert Uytterhoeven wrote:
> > > Wolfram reports that his R-Car H2-based Lager board can no longer be
> > > rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
> > > The issue can be reproduced on other boards (e.g. Koelsch with R-Car
> > > M2-W) too, if CONFIG_IP_PNP is disabled:
> >
> > What kind of PHYs are used on these boards?
> 
> Micrel KSZ8041RNLI
> 
> > >     Unhandled fault: imprecise external abort (0x1406) at 0x00000000
> > >     pgd = (ptrval)
> > >     [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
> > >     Internal error: : 1406 [#1] ARM
> > >     Modules linked in:
> > >     CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf7751 #1048
> > >     Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
> > >     PC is at sh_mdio_ctrl+0x44/0x60
> > >     LR is at sh_mmd_ctrl+0x20/0x24
> > >     ...
> > >     Backtrace:
> > >     [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
> > >      r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
> > >     [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
> > >     [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
> > >      r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
> > >     [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
> > >      r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
> > >     [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
> > >      r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
> > >     [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
> > >      r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
> > >     [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
> > >     [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
> > >      r5:c229f800 r4:c229f800
> > >     [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
> > >      r5:c229f800 r4:c229f804
> > >     [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
> > >     [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
> > >      r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
> > >     [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
> > >     [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
> > >      r5:4321fedc r4:01234567
> > >     [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
> > >      r7:00000058 r6:00000000 r5:00000000 r4:00000000
> > >     [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)
> > >
> > > Calling phy_disable_interrupts() unconditionally means that the PHY
> > > registers may be accessed while the device is suspended, causing
> > > undefined behavior, which may crash the system.
> > >
> > > Fix this by calling phy_disable_interrupts() only when the PHY has been
> > > started.
> > >
> > > Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> > > Fixes: e2f016cf775129c0 ("net: phy: add a shutdown procedure")
> > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > > ---
> > > Marked RFC as I do not know if this change breaks the use case fixed by
> > > the faulty commit.
> >
> > I haven't tested it yet but most probably this change would partially
> > revert the behavior to how things were before adding the shutdown
> > procedure.
> >
> > And this is because the interrupts are enabled at phy_connect and not at
> > phy_start so we would want to disable any PHY interrupts even though the
> > PHY has not been started yet.
> 
> Makes sense.
> 
> > > Alternatively, the device may have to be started
> > > explicitly first.
> >
> > Have you actually tried this out and it worked?
> 
> No, I haven't tested restarting the device first.
> I would like to avoid starting the device during shutdown, unless it is
> absolutely necessary.

I was talking about starting the PHY device but in light of the new
information, this would lead to the exact same crash since it's just
another PHY register access.

Now I understand that you were referring to the sh_eth device itself.

> 
> > I am asking this because I would much rather expect this to be a problem
> > with how the sh_eth driver behaves if the netdevice did not connect to
> > the PHY (this is done in .open() alongside the phy_start()) and it
> > suddenly has to interact with it through the mdiobb_ops callbacks.
> >
> > Also, I just re-tested this use case in which I do not start the
> > interface and just issue a reboot, and it behaves as expected.
> 
> It depends on the hardware: the sh_eth device is powered down when its
> module clock is stopped. When powered down, any access to the sh_eth
> registers or to the PHY connected to it will cause a crash.
> 
> On most other hardware, you can access the PHY regardless, and no crash
> will happen.

Ok, so this does not have anything to do with interrupts explicitly but
rather with the fact that any PHY access will cause a crash when the
sh_eth device is powered down.

If the device is powered-down before the actual .ndo_open() how is the
probe actually setting up the device? Or is the device returned to the
powered-down state after the probe and only powered-up at a subsequent
.ndo_open()?

Instead of the phy_is_started() call we could check if we had previously
enabled the interrupts on the PHY but this would mean that a basic
assumption of the PHY library is violated in that a registered PHY
device cannot access its registers because the MDIO controller just
decided so.

Can't the MDIO bitbang driver callbacks just check if the device is
powered-down and if it is just power it back up temporarily?
Andrew Lunn Jan. 4, 2021, 5:15 p.m. UTC | #4
> Ok, so this does not have anything to do with interrupts explicitly but
> rather with the fact that any PHY access will cause a crash when the
> sh_eth device is powered down.
> 
> If the device is powered-down before the actual .ndo_open() how is the
> probe actually setting up the device? Or is the device returned to the
> powered-down state after the probe and only powered-up at a subsequent
> .ndo_open()?
> 
> Instead of the phy_is_started() call we could check if we had previously
> enabled the interrupts on the PHY but this would mean that a basic
> assumption of the PHY library is violated in that a registered PHY
> device cannot access its registers because the MDIO controller just
> decided so.
> 
> Can't the MDIO bitbang driver callbacks just check if the device is
> powered-down and if it is just power it back up temporarily?

Is this runtime PM?

I had problems with the FEC driver and its runtime PM. After probe, it
would runtime power off its clocks, making the MDIO bus unusable. For
a plain boring setup, this is not too much of a problem, but when you
have a DSA switch on the bus, the DSA driver expects to be able to
access the switch, and this failed. I had to make the MDIO bus driver
in the FEC runtime PM aware, and turn the clocks back on again when an
MDIO transaction occurred.

The basic rule here should be: if the MDIO bus is registered, it is
usable. Things like PHY statistics, HWMON temperature sensors, and DSA
switches all have a life cycle separate from the interface being up.

    Andrew
Andrew Lunn Jan. 4, 2021, 5:31 p.m. UTC | #5
> The basic rules here should be, if the MDIO bus is registered, it is
> usable. There are things like PHY statistics, HWMON temperature
> sensors, etc, DSA switches, all which have a life cycle separate to
> the interface being up.

[Goes and looks at the code]

Yes, this is runtime PM which is broken.

sh_mdio_init() needs to wrap the mdp->mii_bus->read and
mdp->mii_bus->write calls with calls to

pm_runtime_get_sync(&mdp->pdev->dev);

and

pm_runtime_put_sync(&mdp->pdev->dev);
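
The wrapping idea can be sketched in user space as follows. This is only a model under stated assumptions: `model_pm_get()` and `model_pm_put()` are hypothetical stand-ins for the runtime PM get/put calls named above, they act synchronously, and a failed raw read returns -1 where the real hardware would raise an external abort. Real driver code would presumably also need to check the return value of the get call.

```c
#include <assert.h>

/* Hypothetical model of a device under runtime PM; not kernel code. */
struct model_dev {
	int usage;   /* usage counter held by active users */
	int powered; /* 1 while the module clock is running */
};

struct model_bus {
	struct model_dev *parent; /* device that owns the MDIO bus */
	int (*raw_read)(struct model_bus *bus, int addr, int reg);
};

/* Take a reference; power the device up for the first user. */
static void model_pm_get(struct model_dev *dev)
{
	if (dev->usage++ == 0)
		dev->powered = 1;
}

/* Drop a reference; power the device down when the last user leaves. */
static void model_pm_put(struct model_dev *dev)
{
	assert(dev->usage > 0);
	if (--dev->usage == 0)
		dev->powered = 0;
}

/* Unwrapped accessor: fails if the device is powered down. */
static int model_raw_read(struct model_bus *bus, int addr, int reg)
{
	if (!bus->parent->powered)
		return -1; /* stands in for the external abort */
	return (addr << 8) | reg; /* dummy register value */
}

/* Wrapped accessor: keeps the device powered for the duration of the read. */
static int model_pm_read(struct model_bus *bus, int addr, int reg)
{
	int val;

	model_pm_get(bus->parent);
	val = bus->raw_read(bus, addr, reg);
	model_pm_put(bus->parent);

	return val;
}
```

With the device idle, `model_raw_read()` fails while `model_pm_read()` succeeds and leaves the usage counter back at zero afterwards.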

The KSZ8041RNLI supports statistics, which ethtool --phy-stats can
read, and these are also going to cause problems.

      Andrew
Ioana Ciornei Jan. 4, 2021, 6:43 p.m. UTC | #6
On Mon, Jan 04, 2021 at 06:31:05PM +0100, Andrew Lunn wrote:
> > The basic rules here should be, if the MDIO bus is registered, it is
> > usable. There are things like PHY statistics, HWMON temperature
> > sensors, etc, DSA switches, all which have a life cycle separate to
> > the interface being up.
> 
> [Goes and looks at the code]
> 
> Yes, this is runtime PM which is broken.
> 
> sh_mdio_init() needs to wrap the mdp->mii_bus->read and
> mdp->mii_bus->write calls with calls to
> 
> pm_runtime_get_sync(&mdp->pdev->dev);
> 
> and
> 
> pm_runtime_put_sync(&mdp->pdev->dev);
> 

Agreed. Thanks for actually looking into it. I'm not really well versed
in runtime PM.

> The KSZ8041RNLI supports statistics, which ethtool --phy-stats can
> read, and these are also going to cause problems.
> 

Not really: this driver connects to the PHY in .ndo_open(), so any
attempt to dump the PHY statistics before an ifconfig up would get
-EOPNOTSUPP, since dev->phydev is not yet populated.

This is exactly why I do not understand why some drivers insist on
calling of_phy_connect() and its variants on .ndo_open() and not while
probing the device - you can access the debug stats only if the
interface was started.
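
A small user-space sketch of the behaviour described above, assuming the statistics path bails out when no PHY is attached yet. The `model_*` names and the single `rx_errors` counter are hypothetical; only the -EOPNOTSUPP error code is taken from this mail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model types; not the kernel's net_device or phy_device. */
struct model_phydev {
	unsigned long rx_errors;
};

struct model_netdev {
	struct model_phydev *phydev; /* NULL until the interface is opened */
};

/* Model of a driver that connects to its PHY only when opened. */
static void model_open(struct model_netdev *ndev, struct model_phydev *phy)
{
	ndev->phydev = phy;
}

/* Statistics query: never touches the PHY unless one is attached. */
static int model_get_phy_stats(const struct model_netdev *ndev,
			       unsigned long *out)
{
	assert(out != NULL);
	if (!ndev->phydev)
		return -EOPNOTSUPP;
	*out = ndev->phydev->rx_errors;
	return 0;
}
```

Before `model_open()` the query returns -EOPNOTSUPP without dereferencing any PHY state; afterwards it reports the counter.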
Florian Fainelli Jan. 4, 2021, 9:30 p.m. UTC | #7
On 1/4/21 10:43 AM, Ioana Ciornei wrote:
> On Mon, Jan 04, 2021 at 06:31:05PM +0100, Andrew Lunn wrote:
>>> The basic rules here should be, if the MDIO bus is registered, it is
>>> usable. There are things like PHY statistics, HWMON temperature
>>> sensors, etc, DSA switches, all which have a life cycle separate to
>>> the interface being up.
>>
>> [Goes and looks at the code]
>>
>> Yes, this is runtime PM which is broken.
>>
>> sh_mdio_init() needs to wrap the mdp->mii_bus->read and
>> mdp->mii_bus->write calls with calls to
>>
>> pm_runtime_get_sync(&mdp->pdev->dev);
>>
>> and
>>
>> pm_runtime_put_sync(&mdp->pdev->dev);
>>
> 
> Agree. Thanks for actually looking into it.. I'm not really well versed
> in runtime PM.
> 
>> The KSZ8041RNLI supports statistics, which ethtool --phy-stats can
>> read, and these are also going to cause problems.
>>
> 
> Not really, this driver connects to the PHY on .ndo_open(), thus any
> try to actually dump the PHY statistics before an ifconfig up would get
> an -EOPNOTSUPP since the dev->phydev is not yet populated.
> 
> This is exactly why I do not understand why some drivers insist on
> calling of_phy_connect() and its variants on .ndo_open() and not while
> probing the device - you can access the debug stats only if the
> interface was started.

Doing the connect in ndo_open() allows you to keep the PHY in whatever
state it was in prior to the kernel managing it, which, if everything is
correctly designed, means a low power state.

Your Ethernet driver's probe function may be called on boot and you may
never use the network device at all, so it is a waste of energy to power
on the PHY, have it potentially link with its link partner while you
still have no chance of doing any configuration to it because you have
not brought up the network interface.
Geert Uytterhoeven Jan. 5, 2021, 10:01 a.m. UTC | #8
Hi Ioana, Andrew,

On Mon, 4 Jan 2021, Ioana Ciornei wrote:
> On Mon, Jan 04, 2021 at 06:31:05PM +0100, Andrew Lunn wrote:
>>> The basic rules here should be, if the MDIO bus is registered, it is
>>> usable. There are things like PHY statistics, HWMON temperature
>>> sensors, etc, DSA switches, all which have a life cycle separate to
>>> the interface being up.
>>
>> [Goes and looks at the code]
>>
>> Yes, this is runtime PM which is broken.
>>
>> sh_mdio_init() needs to wrap the mdp->mii_bus->read and
>> mdp->mii_bus->write calls with calls to
>>
>> pm_runtime_get_sync(&mdp->pdev->dev);
>>
>> and
>>
>> pm_runtime_put_sync(&mdp->pdev->dev);

That should be pm_runtime_put() rather than pm_runtime_put_sync().

Thanks, that works (see patch below), but I'm still wondering if that is
the right fix...

> Agree. Thanks for actually looking into it.. I'm not really well versed
> in runtime PM.
>
>> The KSZ8041RNLI supports statistics, which ethtool --phy-stats can
>> read, and these are also going to cause problems.
>>
>
> Not really, this driver connects to the PHY on .ndo_open(), thus any
> try to actually dump the PHY statistics before an ifconfig up would get
> an -EOPNOTSUPP since the dev->phydev is not yet populated.

I added a statically-linked ethtool binary to my initramfs, and can
confirm that retrieving the PHY statistics does not access the PHY
registers when the device is suspended:

     # ethtool --phy-statistics eth0
     no stats available
     # ifconfig eth0 up
     # ethtool --phy-statistics eth0
     PHY statistics:
          phy_receive_errors: 0
          phy_idle_errors: 0
     #

In the past, we've gone to great lengths to avoid accessing the PHY
registers when the device is suspended, usually in the statistics
handling (see e.g. [1][2]).  Hence I'm wondering if we should do the
same here, and handle this at a higher layer than the individual network
device driver (other drivers than sh_eth may be affected, too)?

Thanks!

[1] 124eee3f6955f7aa ("net: linkwatch: add check for netdevice being present to linkwatch_do_dev")
[2] https://lore.kernel.org/netdev/11beeaa9-57d5-e641-9486-f2ba202d0998@gmail.com/

From b3cc15e56bddbe65e0196ce04604e5e6c78abd7a Mon Sep 17 00:00:00 2001
From: Geert Uytterhoeven <geert+renesas@glider.be>
Date: Tue, 5 Jan 2021 10:29:22 +0100
Subject: [PATCH] [RFC] sh_eth: Make PHY access aware of Runtime PM to fix
  reboot crash

Wolfram reports that his R-Car H2-based Lager board can no longer be
rebooted in v5.11-rc1, as it crashes with an imprecise external abort.
The issue can be reproduced on other boards (e.g. Koelsch with R-Car
M2-W) too, if CONFIG_IP_PNP is disabled, and the Ethernet interface is
down at reboot time:

     Unhandled fault: imprecise external abort (0x1406) at 0x00000000
     pgd = (ptrval)
     [00000000] *pgd=422b6835, *pte=00000000, *ppte=00000000
     Internal error: : 1406 [#1] ARM
     Modules linked in:
     CPU: 0 PID: 1105 Comm: init Tainted: G        W         5.10.0-rc1-00402-ge2f016cf7751 #1048
     Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
     PC is at sh_mdio_ctrl+0x44/0x60
     LR is at sh_mmd_ctrl+0x20/0x24
     ...
     Backtrace:
     [<c0451f30>] (sh_mdio_ctrl) from [<c0451fd4>] (sh_mmd_ctrl+0x20/0x24)
      r7:0000001f r6:00000020 r5:00000002 r4:c22a1dc4
     [<c0451fb4>] (sh_mmd_ctrl) from [<c044fc18>] (mdiobb_cmd+0x38/0xa8)
     [<c044fbe0>] (mdiobb_cmd) from [<c044feb8>] (mdiobb_read+0x58/0xdc)
      r9:c229f844 r8:c0c329dc r7:c221e000 r6:00000001 r5:c22a1dc4 r4:00000001
     [<c044fe60>] (mdiobb_read) from [<c044c854>] (__mdiobus_read+0x74/0xe0)
      r7:0000001f r6:00000001 r5:c221e000 r4:c221e000
     [<c044c7e0>] (__mdiobus_read) from [<c044c9d8>] (mdiobus_read+0x40/0x54)
      r7:0000001f r6:00000001 r5:c221e000 r4:c221e458
     [<c044c998>] (mdiobus_read) from [<c044d678>] (phy_read+0x1c/0x20)
      r7:ffffe000 r6:c221e470 r5:00000200 r4:c229f800
     [<c044d65c>] (phy_read) from [<c044d94c>] (kszphy_config_intr+0x44/0x80)
     [<c044d908>] (kszphy_config_intr) from [<c044694c>] (phy_disable_interrupts+0x44/0x50)
      r5:c229f800 r4:c229f800
     [<c0446908>] (phy_disable_interrupts) from [<c0449370>] (phy_shutdown+0x18/0x1c)
      r5:c229f800 r4:c229f804
     [<c0449358>] (phy_shutdown) from [<c040066c>] (device_shutdown+0x168/0x1f8)
     [<c0400504>] (device_shutdown) from [<c013de44>] (kernel_restart_prepare+0x3c/0x48)
      r9:c22d2000 r8:c0100264 r7:c0b0d034 r6:00000000 r5:4321fedc r4:00000000
     [<c013de08>] (kernel_restart_prepare) from [<c013dee0>] (kernel_restart+0x1c/0x60)
     [<c013dec4>] (kernel_restart) from [<c013e1d8>] (__do_sys_reboot+0x168/0x208)
      r5:4321fedc r4:01234567
     [<c013e070>] (__do_sys_reboot) from [<c013e2e8>] (sys_reboot+0x18/0x1c)
      r7:00000058 r6:00000000 r5:00000000 r4:00000000
     [<c013e2d0>] (sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x54)

As of commit e2f016cf775129c0 ("net: phy: add a shutdown procedure"),
system reboot calls phy_disable_interrupts() during shutdown.  As this
happens unconditionally, the PHY registers may be accessed while the
device is suspended, causing undefined behavior, which may crash the
system.

Fix this by wrapping the PHY bitbang accessors in the sh_eth driver with
wrappers that handle Runtime PM, resuming the device when needed.

Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
  drivers/net/ethernet/renesas/sh_eth.c | 34 +++++++++++++++++++++++++++
  1 file changed, 34 insertions(+)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index c633046329352601..f8b306fa61bc25ca 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1162,7 +1162,10 @@ static void read_mac_address(struct net_device *ndev, unsigned char *mac)

  struct bb_info {
  	void (*set_gate)(void *addr);
+	int (*read)(struct mii_bus *bus, int addr, int regnum);
+	int (*write)(struct mii_bus *bus, int addr, int regnum, u16 val);
  	struct mdiobb_ctrl ctrl;
+	struct device *dev;
  	void *addr;
  };

@@ -3034,6 +3037,30 @@ static int sh_mdio_release(struct sh_eth_private *mdp)
  	return 0;
  }

+static int sh_mdiobb_read(struct mii_bus *bus, int phy, int reg)
+{
+	struct bb_info *bb = container_of(bus->priv, struct bb_info, ctrl);
+	int res;
+
+	pm_runtime_get_sync(bb->dev);
+	res = bb->read(bus, phy, reg);
+	pm_runtime_put(bb->dev);
+
+	return res;
+}
+
+static int sh_mdiobb_write(struct mii_bus *bus, int phy, int reg, u16 val)
+{
+	struct bb_info *bb = container_of(bus->priv, struct bb_info, ctrl);
+	int res;
+
+	pm_runtime_get_sync(bb->dev);
+	res = bb->write(bus, phy, reg, val);
+	pm_runtime_put(bb->dev);
+
+	return res;
+}
+
  /* MDIO bus init function */
  static int sh_mdio_init(struct sh_eth_private *mdp,
  			struct sh_eth_plat_data *pd)
@@ -3052,12 +3079,19 @@ static int sh_mdio_init(struct sh_eth_private *mdp,
  	bitbang->addr = mdp->addr + mdp->reg_offset[PIR];
  	bitbang->set_gate = pd->set_mdio_gate;
  	bitbang->ctrl.ops = &bb_ops;
+	bitbang->dev = dev;

  	/* MII controller setting */
  	mdp->mii_bus = alloc_mdio_bitbang(&bitbang->ctrl);
  	if (!mdp->mii_bus)
  		return -ENOMEM;

+	/* Wrap accessors with Runtime PM-aware ops */
+	bitbang->read = mdp->mii_bus->read;
+	bitbang->write = mdp->mii_bus->write;
+	mdp->mii_bus->read = sh_mdiobb_read;
+	mdp->mii_bus->write = sh_mdiobb_write;
+
  	/* Hook up MII support for ethtool */
  	mdp->mii_bus->name = "sh_mii";
  	mdp->mii_bus->parent = dev;
Andrew Lunn Jan. 5, 2021, 2:10 p.m. UTC | #9
> I added a statically-linked ethtool binary to my initramfs, and can
> confirm that retrieving the PHY statistics does not access the PHY
> registers when the device is suspended:
> 
>     # ethtool --phy-statistics eth0
>     no stats available
>     # ifconfig eth0 up
>     # ethtool --phy-statistics eth0
>     PHY statistics:
> 	 phy_receive_errors: 0
> 	 phy_idle_errors: 0
>     #
> 
> In the past, we've gone to great lengths to avoid accessing the PHY
> registers when the device is suspended, usually in the statistics
> handling (see e.g. [1][2]).

I would argue that is the wrong approach. The PHY device is a
device. It has its own lifetime. You would not suspend a PCI bus
controller without first suspending all PCI devices on the bus etc.

> +static int sh_mdiobb_read(struct mii_bus *bus, int phy, int reg)
> +{
> +	struct bb_info *bb = container_of(bus->priv, struct bb_info, ctrl);

mii_bus->parent should give you dev, so there is no need to add it to
bb_info.

> +	/* Wrap accessors with Runtime PM-aware ops */
> +	bitbang->read = mdp->mii_bus->read;
> +	bitbang->write = mdp->mii_bus->write;
> +	mdp->mii_bus->read = sh_mdiobb_read;
> +	mdp->mii_bus->write = sh_mdiobb_write;

I did wonder about just exporting the two functions so you can
directly call them.

Otherwise, this looks good.

	   Andrew
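
The two review suggestions above (take the device from the bus's parent pointer, and call the bitbang helpers directly instead of saving and swapping function pointers) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `mock_*` and `sketch_*` name is hypothetical, the power state is a plain flag, and the put is synchronous here, whereas the kernel's `pm_runtime_put()` may defer the suspend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types; none of this is kernel code. */
struct mock_device {
	int usage_count;	/* models the Runtime PM usage counter */
	int powered;		/* 1 while the device is resumed */
};

struct mock_mii_bus {
	struct mock_device *parent;	/* models mii_bus->parent */
	uint16_t regs[32];		/* fake PHY register file */
};

/* Resume the device on the first reference. */
static void mock_pm_get_sync(struct mock_device *dev)
{
	if (dev->usage_count++ == 0)
		dev->powered = 1;
}

/* Drop a reference; this toy suspends immediately when it reaches zero. */
static void mock_pm_put(struct mock_device *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0)
		dev->powered = 0;
}

/* Models the raw bitbang accessors: they fail while the parent is suspended. */
static int mock_mdiobb_read(struct mock_mii_bus *bus, int phy, int reg)
{
	(void)phy;
	if (!bus->parent->powered)
		return -1;
	return bus->regs[reg & 0x1f];
}

static int mock_mdiobb_write(struct mock_mii_bus *bus, int phy, int reg,
			     uint16_t val)
{
	(void)phy;
	if (!bus->parent->powered)
		return -1;
	bus->regs[reg & 0x1f] = val;
	return 0;
}

/* Wrappers: the device comes from bus->parent, no extra field is needed. */
static int sketch_mdio_read(struct mock_mii_bus *bus, int phy, int reg)
{
	int res;

	mock_pm_get_sync(bus->parent);
	res = mock_mdiobb_read(bus, phy, reg);
	mock_pm_put(bus->parent);

	return res;
}

static int sketch_mdio_write(struct mock_mii_bus *bus, int phy, int reg,
			     uint16_t val)
{
	int res;

	mock_pm_get_sync(bus->parent);
	res = mock_mdiobb_write(bus, phy, reg, val);
	mock_pm_put(bus->parent);

	return res;
}
```

In this model a raw access on a suspended parent fails, while the wrapped access succeeds and leaves the usage count balanced afterwards.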
Geert Uytterhoeven Jan. 13, 2021, 9:02 a.m. UTC | #10
Hi Andrew,

On Tue, Jan 5, 2021 at 3:10 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > I added a statically-linked ethtool binary to my initramfs, and can
> > confirm that retrieving the PHY statistics does not access the PHY
> > registers when the device is suspended:
> >
> >     # ethtool --phy-statistics eth0
> >     no stats available
> >     # ifconfig eth0 up
> >     # ethtool --phy-statistics eth0
> >     PHY statistics:
> >        phy_receive_errors: 0
> >        phy_idle_errors: 0
> >     #
> >
> > In the past, we've gone to great lengths to avoid accessing the PHY
> > registers when the device is suspended, usually in the statistics
> > handling (see e.g. [1][2]).
>
> I would argue that is the wrong approach. The PHY device is a
> device. It has its own lifetime. You would not suspend a PCI bus
> controller without first suspending all PCI devices on the bus etc.

Makes sense.  So perhaps the PHY devices should become full citizens
instead, and start using Runtime PM themselves? Then the device
framework will take care of it automatically through the devices'
parent/child relations.

This would be similar to e.g. commit 3a611e26e958b037 ("net/smsc911x:
Add minimal runtime PM support"), but now for PHYs w.r.t. their parent
network controller device, instead of for the network controller device
w.r.t. its parent bus.
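
As a rough illustration of the parent/child idea above, here is a toy user-space model, not kernel code. All `toy_*` names are hypothetical, and the model is simplified: a resumed child simply holds a reference on its parent, whereas the real Runtime PM core tracks active children separately from the usage counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with an optional parent, e.g. PHY -> MAC. */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;
	int usage_count;
	int active;
};

/* Take a reference; on the first one, resume the parent before the child. */
static void toy_pm_get(struct toy_dev *dev)
{
	if (dev->usage_count++ > 0)
		return;
	if (dev->parent)
		toy_pm_get(dev->parent);
	dev->active = 1;
}

/* Drop a reference; on the last one, suspend the child, then release the parent. */
static void toy_pm_put(struct toy_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count > 0)
		return;
	dev->active = 0;
	if (dev->parent)
		toy_pm_put(dev->parent);
}

/* A PHY register access only works while the parent controller is active. */
static int toy_phy_read(struct toy_dev *phy)
{
	int ok;

	toy_pm_get(phy);
	ok = phy->parent != NULL && phy->parent->active;
	toy_pm_put(phy);

	return ok ? 0 : -1;
}
```

With this arrangement the PHY code only has to take a reference on itself; waking the network controller falls out of the parent link.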

> > +static int sh_mdiobb_read(struct mii_bus *bus, int phy, int reg)
> > +{
> > +     struct bb_info *bb = container_of(bus->priv, struct bb_info, ctrl);
>
> mii_bus->parent should give you dev, so there is no need to add it to
> bb_info.

OK.

> > +     /* Wrap accessors with Runtime PM-aware ops */
> > +     bitbang->read = mdp->mii_bus->read;
> > +     bitbang->write = mdp->mii_bus->write;
> > +     mdp->mii_bus->read = sh_mdiobb_read;
> > +     mdp->mii_bus->write = sh_mdiobb_write;
>
> I did wonder about just exporting the two functions so you can
> directly call them.

I did consider that. Do you prefer exporting the functions?

> Otherwise, this looks good.

Thanks. Do you want me to submit (with the above changed) as an interim
solution?

Note that the same issue seems to be present in the Renesas EtherAVB
driver.  But that is more difficult to reproduce, as I don't have any arm32
boards that use RAVB, and on arm64 register access while a device is
suspended doesn't cause a crash, but continues silently without any effect.

Gr{oetje,eeting}s,

                        Geert

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 80c2e646c0934311..5985061b00128f8a 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -2962,7 +2962,8 @@  static void phy_shutdown(struct device *dev)
 {
 	struct phy_device *phydev = to_phy_device(dev);
 
-	phy_disable_interrupts(phydev);
+	if (phy_is_started(phydev))
+		phy_disable_interrupts(phydev);
 }
 
 /**