Message ID | 20230804175842.209537-1-marex@denx.de (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: phy: at803x: Improve hibernation support on start up | expand |
Hi Marek, > Toggle hibernation mode OFF and ON to wake the PHY up and make it > generate clock on RX_CLK pin for about 10 seconds. > These clock are needed during start up by MACs like DWMAC in NXP i.MX8M > Plus to release their DMA from reset. After the MAC has started up, the PHY > can enter hibernation and disable the RX_CLK clock, this poses no problem for > the MAC. > > Originally, this issue has been described by NXP in commit > 9ecf04016c87 ("net: phy: at803x: add disable hibernation mode support") but > this approach fully disables the hibernation support and takes away any power > saving benefit. This patch instead makes the PHY generate the clock on start > up for 10 seconds, which should be long enough for the EQoS MAC to release > DMA from reset. > > Before this patch on i.MX8M Plus board with AR8031 PHY: > " > $ ifconfig eth1 up > [ 25.576734] imx-dwmac 30bf0000.ethernet eth1: Register > MEM_TYPE_PAGE_POOL RxQ-0 > [ 25.658916] imx-dwmac 30bf0000.ethernet eth1: PHY [stmmac-1:00] > driver [Qualcomm Atheros AR8031/AR8033] (irq=38) > [ 26.670276] imx-dwmac 30bf0000.ethernet: Failed to reset the dma > [ 26.676322] imx-dwmac 30bf0000.ethernet eth1: stmmac_hw_setup: > DMA engine initialization failed > [ 26.685103] imx-dwmac 30bf0000.ethernet eth1: __stmmac_open: Hw > setup failed > ifconfig: SIOCSIFFLAGS: Connection timed out " > Have you reproduced this issue based on the upstream net-next or net tree? If so, can this issue be reproduced? The reason why I ask this is because when I tried to reproduce this problem on net-next 6.3.0 version, I found that it could not be reproduced (I did not disable hibernation mode when I reproduced this issue ). So I guess maybe other patches in eqos driver fixed the issue.
On 8/8/23 10:44, Wei Fang wrote: > Hi Marek, Hi, >> Toggle hibernation mode OFF and ON to wake the PHY up and make it >> generate clock on RX_CLK pin for about 10 seconds. >> These clock are needed during start up by MACs like DWMAC in NXP i.MX8M >> Plus to release their DMA from reset. After the MAC has started up, the PHY >> can enter hibernation and disable the RX_CLK clock, this poses no problem for >> the MAC. >> >> Originally, this issue has been described by NXP in commit >> 9ecf04016c87 ("net: phy: at803x: add disable hibernation mode support") but >> this approach fully disables the hibernation support and takes away any power >> saving benefit. This patch instead makes the PHY generate the clock on start >> up for 10 seconds, which should be long enough for the EQoS MAC to release >> DMA from reset. >> >> Before this patch on i.MX8M Plus board with AR8031 PHY: >> " >> $ ifconfig eth1 up >> [ 25.576734] imx-dwmac 30bf0000.ethernet eth1: Register >> MEM_TYPE_PAGE_POOL RxQ-0 >> [ 25.658916] imx-dwmac 30bf0000.ethernet eth1: PHY [stmmac-1:00] >> driver [Qualcomm Atheros AR8031/AR8033] (irq=38) >> [ 26.670276] imx-dwmac 30bf0000.ethernet: Failed to reset the dma >> [ 26.676322] imx-dwmac 30bf0000.ethernet eth1: stmmac_hw_setup: >> DMA engine initialization failed >> [ 26.685103] imx-dwmac 30bf0000.ethernet eth1: __stmmac_open: Hw >> setup failed >> ifconfig: SIOCSIFFLAGS: Connection timed out " >> > > Have you reproduced this issue based on the upstream net-next or net tree? On current linux-next next-20230808 so 6.5.0-rc5 . As far as I can tell, net-next is merged into this tree too. > If so, can this issue be reproduced? The reason why I ask this is because when > I tried to reproduce this problem on net-next 6.3.0 version, I found that it could > not be reproduced (I did not disable hibernation mode when I reproduced this > issue ). So I guess maybe other patches in eqos driver fixed the issue. This is what I use for testing: - Make sure "qca,disable-hibernation-mode" is NOT present in PHY DT node - Boot the machine with NO ethernet cable plugged into the affected port (i.e. the EQoS port), this is important - Make sure the EQoS MAC is not brought up e.g. by systemd-networkd or whatever other tool, I use busybox initramfs for testing with plain script as init (it mounts the various filesystems and runs /bin/sh) - Wait longer than 10 seconds - If possible, measure AR8031 PHY pin 33 RX_CLK, wait for the RX_CLK to be turned OFF by the PHY (means PHY entered hibernation) - ifconfig ethN up -- try to bring up the EQoS MAC <observe failure> [...]
> >> Toggle hibernation mode OFF and ON to wake the PHY up and make it > >> generate clock on RX_CLK pin for about 10 seconds. > >> These clock are needed during start up by MACs like DWMAC in NXP > >> i.MX8M Plus to release their DMA from reset. After the MAC has > >> started up, the PHY can enter hibernation and disable the RX_CLK > >> clock, this poses no problem for the MAC. > >> > >> Originally, this issue has been described by NXP in commit > >> 9ecf04016c87 ("net: phy: at803x: add disable hibernation mode > >> support") but this approach fully disables the hibernation support > >> and takes away any power saving benefit. This patch instead makes the > >> PHY generate the clock on start up for 10 seconds, which should be > >> long enough for the EQoS MAC to release DMA from reset. > >> > >> Before this patch on i.MX8M Plus board with AR8031 PHY: > >> " > >> $ ifconfig eth1 up > >> [ 25.576734] imx-dwmac 30bf0000.ethernet eth1: Register > >> MEM_TYPE_PAGE_POOL RxQ-0 > >> [ 25.658916] imx-dwmac 30bf0000.ethernet eth1: PHY [stmmac-1:00] > >> driver [Qualcomm Atheros AR8031/AR8033] (irq=38) > >> [ 26.670276] imx-dwmac 30bf0000.ethernet: Failed to reset the dma > >> [ 26.676322] imx-dwmac 30bf0000.ethernet eth1: stmmac_hw_setup: > >> DMA engine initialization failed > >> [ 26.685103] imx-dwmac 30bf0000.ethernet eth1: __stmmac_open: > Hw > >> setup failed > >> ifconfig: SIOCSIFFLAGS: Connection timed out " > >> > > > > Have you reproduced this issue based on the upstream net-next or net > tree? > > On current linux-next next-20230808 so 6.5.0-rc5 . As far as I can tell, > net-next is merged into this tree too. > > > If so, can this issue be reproduced? The reason why I ask this is > > because when I tried to reproduce this problem on net-next 6.3.0 > > version, I found that it could not be reproduced (I did not disable > > hibernation mode when I reproduced this issue ). So I guess maybe other > patches in eqos driver fixed the issue. > > This is what I use for testing: > > - Make sure "qca,disable-hibernation-mode" is NOT present in PHY DT node Yes, I deleted this property when I reproduced this issue. > - Boot the machine with NO ethernet cable plugged into the affected port > (i.e. the EQoS port), this is important > - Make sure the EQoS MAC is not brought up e.g. by systemd-networkd or > whatever other tool, I use busybox initramfs for testing with plain > script as init (it mounts the various filesystems and runs /bin/sh) It looks like something has been changed since I submitted the "disable hibernation mode " patch. In previous test, I only need to unplug the cable and then use ifconfig cmd to disable the interface, wait more than 10 seconds, then use ifconfig cmd to enable the interface. > - Wait longer than 10 seconds > - If possible, measure AR8031 PHY pin 33 RX_CLK, wait for the RX_CLK to > be turned OFF by the PHY (means PHY entered hibernation) > - ifconfig ethN up -- try to bring up the EQoS MAC <observe failure> > > [...] For the patch, I think your approach is better than mine, but I have a suggestion, is the following modification more appropriate? --- a/drivers/net/phy/at803x.c +++ b/drivers/net/phy/at803x.c @@ -991,12 +991,28 @@ static int at8031_pll_config(struct phy_device *phydev) static int at803x_hibernation_mode_config(struct phy_device *phydev) { struct at803x_priv *priv = phydev->priv; + int ret; /* The default after hardware reset is hibernation mode enabled. After * software reset, the value is retained. */ - if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) - return 0; + if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) { + /* Toggle hibernation mode OFF and ON to wake the PHY up and + * make it generate clock on RX_CLK pin for about 10 seconds. + * These clock are needed during start up by MACs like DWMAC + * in NXP i.MX8M Plus to release their DMA from reset. After + * the MAC has started up, the PHY can enter hibernation and + * disable the RX_CLK clock, this poses no problem for the MAC. + */ + ret = at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN, 0); + if (ret < 0) + return ret; + + return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN, + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN); + } return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL,
On Wed, Aug 09, 2023 at 02:27:12AM +0000, Wei Fang wrote: > > >> Toggle hibernation mode OFF and ON to wake the PHY up and make it > > >> generate clock on RX_CLK pin for about 10 seconds. > > >> These clock are needed during start up by MACs like DWMAC in NXP > > >> i.MX8M Plus to release their DMA from reset. After the MAC has > > >> started up, the PHY can enter hibernation and disable the RX_CLK > > >> clock, this poses no problem for the MAC. > > >> > > >> Originally, this issue has been described by NXP in commit > > >> 9ecf04016c87 ("net: phy: at803x: add disable hibernation mode > > >> support") but this approach fully disables the hibernation support > > >> and takes away any power saving benefit. This patch instead makes the > > >> PHY generate the clock on start up for 10 seconds, which should be > > >> long enough for the EQoS MAC to release DMA from reset. > > >> > > >> Before this patch on i.MX8M Plus board with AR8031 PHY: > > >> " > > >> $ ifconfig eth1 up > > >> [ 25.576734] imx-dwmac 30bf0000.ethernet eth1: Register > > >> MEM_TYPE_PAGE_POOL RxQ-0 > > >> [ 25.658916] imx-dwmac 30bf0000.ethernet eth1: PHY [stmmac-1:00] > > >> driver [Qualcomm Atheros AR8031/AR8033] (irq=38) > > >> [ 26.670276] imx-dwmac 30bf0000.ethernet: Failed to reset the dma > > >> [ 26.676322] imx-dwmac 30bf0000.ethernet eth1: stmmac_hw_setup: > > >> DMA engine initialization failed > > >> [ 26.685103] imx-dwmac 30bf0000.ethernet eth1: __stmmac_open: > > Hw > > >> setup failed > > >> ifconfig: SIOCSIFFLAGS: Connection timed out " > > >> > > > > > > Have you reproduced this issue based on the upstream net-next or net > > tree? > > > > On current linux-next next-20230808 so 6.5.0-rc5 . As far as I can tell, > > net-next is merged into this tree too. > > > > > > If so, can this issue be reproduced? The reason why I ask this is > > > because when I tried to reproduce this problem on net-next 6.3.0 > > > version, I found that it could not be reproduced (I did not disable > > > hibernation mode when I reproduced this issue ). So I guess maybe other > > patches in eqos driver fixed the issue. > > > > This is what I use for testing: > > > > - Make sure "qca,disable-hibernation-mode" is NOT present in PHY DT node > Yes, I deleted this property when I reproduced this issue. > > > - Boot the machine with NO ethernet cable plugged into the affected port > > (i.e. the EQoS port), this is important > > - Make sure the EQoS MAC is not brought up e.g. by systemd-networkd or > > whatever other tool, I use busybox initramfs for testing with plain > > script as init (it mounts the various filesystems and runs /bin/sh) > > It looks like something has been changed since I submitted the "disable hibernation > mode " patch. In previous test, I only need to unplug the cable and then use ifconfig > cmd to disable the interface, wait more than 10 seconds, then use ifconfig cmd to > enable the interface. > > > - Wait longer than 10 seconds > > - If possible, measure AR8031 PHY pin 33 RX_CLK, wait for the RX_CLK to > > be turned OFF by the PHY (means PHY entered hibernation) > > - ifconfig ethN up -- try to bring up the EQoS MAC <observe failure> > > > > [...] > > For the patch, I think your approach is better than mine, but I have a suggestion, > is the following modification more appropriate? > > --- a/drivers/net/phy/at803x.c > +++ b/drivers/net/phy/at803x.c > @@ -991,12 +991,28 @@ static int at8031_pll_config(struct phy_device *phydev) > static int at803x_hibernation_mode_config(struct phy_device *phydev) > { > struct at803x_priv *priv = phydev->priv; > + int ret; > > /* The default after hardware reset is hibernation mode enabled. After > * software reset, the value is retained. > */ > - if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) > - return 0; > + if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) { > + /* Toggle hibernation mode OFF and ON to wake the PHY up and > + * make it generate clock on RX_CLK pin for about 10 seconds. > + * These clock are needed during start up by MACs like DWMAC > + * in NXP i.MX8M Plus to release their DMA from reset. After > + * the MAC has started up, the PHY can enter hibernation and > + * disable the RX_CLK clock, this poses no problem for the MAC. > + */ > + ret = at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, > + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN, 0); > + if (ret < 0) > + return ret; > + > + return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, > + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN, > + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN); > + } > > return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, Hm.. how about officially defining this PHY as the clock provider and disable PHY automatic hibernation as long as clock is acquired? Regards, Oleksij
> > For the patch, I think your approach is better than mine, but I have a > > suggestion, is the following modification more appropriate? > > > > --- a/drivers/net/phy/at803x.c > > +++ b/drivers/net/phy/at803x.c > > @@ -991,12 +991,28 @@ static int at8031_pll_config(struct phy_device > > *phydev) static int at803x_hibernation_mode_config(struct phy_device > > *phydev) { > > struct at803x_priv *priv = phydev->priv; > > + int ret; > > > > /* The default after hardware reset is hibernation mode enabled. > After > > * software reset, the value is retained. > > */ > > - if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) > > - return 0; > > + if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) { > > + /* Toggle hibernation mode OFF and ON to wake the > PHY up and > > + * make it generate clock on RX_CLK pin for about 10 > seconds. > > + * These clock are needed during start up by MACs like > DWMAC > > + * in NXP i.MX8M Plus to release their DMA from reset. > After > > + * the MAC has started up, the PHY can enter > hibernation and > > + * disable the RX_CLK clock, this poses no problem for > the MAC. > > + */ > > + ret = at803x_debug_reg_mask(phydev, > AT803X_DEBUG_REG_HIB_CTRL, > > + > AT803X_DEBUG_HIB_CTRL_PS_HIB_EN, 0); > > + if (ret < 0) > > + return ret; > > + > > + return at803x_debug_reg_mask(phydev, > AT803X_DEBUG_REG_HIB_CTRL, > > + > AT803X_DEBUG_HIB_CTRL_PS_HIB_EN, > > + > AT803X_DEBUG_HIB_CTRL_PS_HIB_EN); > > + } > > > > return at803x_debug_reg_mask(phydev, > > AT803X_DEBUG_REG_HIB_CTRL, > > Hm.. how about officially defining this PHY as the clock provider and disable > PHY automatic hibernation as long as clock is acquired? > Sorry, I don't know much about the clock provider/consumer, but I think there will be more changes if we use clock provider/consume mechanism. Furthermore, we would expect the hibernation mode is enabled when the ethernet interface is brought up but the cable is not plugged, that is to say, we only need the PHY to provide the clock for a while to make the MAC reset successfully. Therefore, I think the current approach is more simple and effective, and it takes full advantage of the characteristics of the hardware (The PHY will continue to provide the clock about 10 seconds after hibernation mode is enabled when the cable is not plugged and automatically disable the clock after 10 seconds, so the 10 seconds is enough for the MAC to reset successfully).
CCing clk folks in the loop. On Wed, Aug 09, 2023 at 05:37:45AM +0000, Wei Fang wrote: ... > > Hm.. how about officially defining this PHY as the clock provider and disable > > PHY automatic hibernation as long as clock is acquired? > > > Sorry, I don't know much about the clock provider/consumer, but I think there > will be more changes if we use clock provider/consume mechanism. Yes, more changes will be needed. > Furthermore, > we would expect the hibernation mode is enabled when the ethernet interface > is brought up but the cable is not plugged, that is to say, we only need the PHY to > provide the clock for a while to make the MAC reset successfully. Means, if external clock is not provided, MAC is not fully functional. Correct? What kind of MAC operation will fail in this case? For example, if stmmac_open() fails without external clock, will stmmac_release() work properly? Will we be able to do any configuration on an interface which is opened, but without active link and hibernated clock? How about self tests? > Therefore, I think > the current approach is more simple and effective, and it takes full advantage of the > characteristics of the hardware (The PHY will continue to provide the clock about > 10 seconds after hibernation mode is enabled when the cable is not plugged and > automatically disable the clock after 10 seconds, so the 10 seconds is enough for > the MAC to reset successfully). If multiple independent operations are synchronized based on the assumption that 10 seconds should be enough, bad thing happens. If fully functional external clock provider is need to initialize the MAC, just disabling this clock on already initialized HW without doing proper re-initialization sequence is usually bad idea. HW may get some glitch which will make troubleshooting a pain. Regards, Oleksij
On Wed, Aug 09, 2023 at 08:08:36AM +0200, Oleksij Rempel wrote: > If fully functional external clock provider is need to initialize the > MAC, just disabling this clock on already initialized HW without doing > proper re-initialization sequence is usually bad idea. HW may get some > glitch which will make troubleshooting a pain. There are cases where the PHY sits on a MDIO bus that is created by the ethernet MAC driver, which means the PHY only exists during the ethernet MAC driver probe. I think that provided the clock is only obtained after we know the PHY is present, that would probably be fine - but doing it before the MDIO bus has been created will of course cause problems. We've had these issues before with stmmac, so this "stmmac needs the PHY receive clock" is nothing new - it's had problems with system suspend/resume in the past, and I've made suggestions... and when there's been two people trying to work on it, I've attempted to get them to talk to each other which resulted in nothing further happening. Another solution could possibly be that we reserve bit 30 on the PHY dev_flags to indicate that the receive clock must always be provided. I suspect that would have an advantage in another situation - for EEE, there's a control bit which allows the receive clock to be stopped while the link is in low-power state. If the MAC always needs the receive clock, then obviously that should be blocked.
> > Hm.. how about officially defining this PHY as the clock provider and disable > > PHY automatic hibernation as long as clock is acquired? > > > Sorry, I don't know much about the clock provider/consumer, but I think there > will be more changes if we use clock provider/consume mechanism. Less changes is not always best. What happens when a different PHY is used? By having a clock provider in the PHY, you are defining a clear API that any PHY needs to provide to work with this MAC. As Russell has point out, this is not the first time we have run into this problem. So far, it seems everybody has tried to solve it for just their system. So maybe now we should look at the whole picture and put in place a generic solution which can be made to work for any PHY. Andrew
On 8/9/23 15:40, Andrew Lunn wrote: >>> Hm.. how about officially defining this PHY as the clock provider and disable >>> PHY automatic hibernation as long as clock is acquired? >>> >> Sorry, I don't know much about the clock provider/consumer, but I think there >> will be more changes if we use clock provider/consume mechanism. > > Less changes is not always best. What happens when a different PHY is > used? Then the system wouldn't be affected by this AR803x specific behavior. > By having a clock provider in the PHY, you are defining a clear > API that any PHY needs to provide to work with this MAC. > > As Russell has point out, this is not the first time we have run into > this problem. So far, it seems everybody has tried to solve it for > just their system. So maybe now we should look at the whole picture > and put in place a generic solution which can be made to work for any > PHY. I'll see what I can do, it might take a while though.
On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > On 8/9/23 15:40, Andrew Lunn wrote: > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > will be more changes if we use clock provider/consume mechanism. > > > > Less changes is not always best. What happens when a different PHY is > > used? > > Then the system wouldn't be affected by this AR803x specific behavior. Do you know it really is specific to the AR803x? Turning the clock off seams a reasonable thing to do when saving power, or when there is no link partner. Andrew
On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > On 8/9/23 15:40, Andrew Lunn wrote: > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > will be more changes if we use clock provider/consume mechanism. > > > > Less changes is not always best. What happens when a different PHY is > > used? > > Then the system wouldn't be affected by this AR803x specific behavior. I don't think it is AR803x specific behaviour. I think it is an interesting stmmac behaviour, where it requires the clock from the PHY in order to do even the most trivial things (like reset itself.) That is a very interesting design decision. how does stmmac hardware complete a power-on reset if it needs the external clock from a PHY that itself might be in the process of powering itself up and establishing its clocks... There have been *hacks* to phylink requested over the years to work around this peculiar behaviour by stmmac - and it seems that it's not common behaviour. This kind of thing might affect Cadence's macb driver as well, but rather than it being a clock from the ethernet PHY, it seems to be from the serdes PHY built in to the SoC - if I understand what's reported in the proposed patch commit log (which I don't fully.) In the case of stmmac, I don't think it's fair to blame the AR803x. It's a hardware integration issue - the AR803x implementation which works fine elsewhere has a problem with the stmmac implementation, because design decisions made in both implementations end up being incompatible with each other. However, pair them with different implementations, and they're fine. Given that stmmac requires a clock from the PHY, I'm of the opinion that we need to have a way to tell phylib that "hey, this MAC must always have a receive clock from the PHY, please arrange for that to happen". AR803x needs to check that and arrange for the receive clock to always be output. phylib can also use that to ensure that when EEE mode is active in the PHY, clock-stop support is disabled... and that's actually a key part to getting EEE properly implemented. Clearly, the IEEE 802.3 folk catered for this issue when specifying EEE, where some MACs must always be fed a receive clock, and so I think phylib in Linux needs to recognise that this is A Thing that it should allow MACs to specify.
On 8/10/23 00:06, Andrew Lunn wrote: > On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: >> On 8/9/23 15:40, Andrew Lunn wrote: >>>>> Hm.. how about officially defining this PHY as the clock provider and disable >>>>> PHY automatic hibernation as long as clock is acquired? >>>>> >>>> Sorry, I don't know much about the clock provider/consumer, but I think there >>>> will be more changes if we use clock provider/consume mechanism. >>> >>> Less changes is not always best. What happens when a different PHY is >>> used? >> >> Then the system wouldn't be affected by this AR803x specific behavior. > > Do you know it really is specific to the AR803x? Turning the clock off > seams a reasonable thing to do when saving power, or when there is no > link partner. This hibernation behavior seem specific to this PHY, I haven't seen it on another PHY connected to the EQoS so far.
> > Furthermore, > > we would expect the hibernation mode is enabled when the ethernet > > interface is brought up but the cable is not plugged, that is to say, > > we only need the PHY to provide the clock for a while to make the MAC > reset successfully. > > Means, if external clock is not provided, MAC is not fully functional. > Correct? > The MAC will failed to do software reset if the PHY input clocks are not present. According to the datasheet from Synopsys, the software reset is for the MAC and the DMA controller reset the logic and all internal registers of the DMA, MTL, and MAC. One obvious error log can be seen from the console shows as follows. [ 26.670276] imx-dwmac 30bf0000.ethernet: Failed to reset the dma [ 26.676322] imx-dwmac 30bf0000.ethernet eth1: stmmac_hw_setup: DMA engine initialization failed [ 26.685103] imx-dwmac 30bf0000.ethernet eth1: __stmmac_open: Hw setup failed I believe other manufacturers who use the dwmac IP also have encountered the similar issue and use other methods to resolve it, such as intel, 49725ffc15fc ("net: stmmac: power up/down serdes in stmmac_open/release ") > What kind of MAC operation will fail in this case? Below are the steps from Marek to reproduce the issue. - Make sure "qca,disable-hibernation-mode" is NOT present in PHY DT node - Boot the machine with NO ethernet cable plugged into the affected port (i.e. the EQoS port), this is important - Make sure the EQoS MAC is not brought up e.g. by systemd-networkd or whatever other tool, I use busybox initramfs for testing with plain script as init (it mounts the various filesystems and runs /bin/sh) - Wait longer than 10 seconds - If possible, measure AR8031 PHY pin 33 RX_CLK, wait for the RX_CLK to be turned OFF by the PHY (means PHY entered hibernation) - ifconfig ethN up -- try to bring up the EQoS MAC <observe failure> But the situation I encountered on i.MX8DXL-EVK platform was not as complicated as above. At that time, it only took 3 steps to reproduce this issue. But now, the latest upstream tree cannot reproduce this issue based on the following three steps. Something has been changed, I think it should be feafeb53140a (arm64: dts: imx8dxl-evk: Fix eqos phy reset gpio). - unplug the cable - ifconfig eth0 down and wait for about 10 seconds - ifconfig eth0 up > For example, if stmmac_open() fails without external clock, will > stmmac_release() work properly? Actually, I don't know much about stmmac driver and the dwmac IP, Because I'm not responsible for this IP on NXP i.MX platform. But I have a look at the code of stmmac_release(), I think stmmac_release() will work properly, because it invokes phylink_stop() and phylink_disconnect_phy() first which will disable the clock from PHY. >Will we be able to do any configuration on > an interface which is opened, but without active link and hibernated clock? I don't know, but as far as I know, we can not set the VLAN filter successfully due to lack of clock. So there may be other configurations that depend on the clock. > How about self tests? > > > Therefore, I think > > the current approach is more simple and effective, and it takes full > > advantage of the characteristics of the hardware (The PHY will > > continue to provide the clock about > > 10 seconds after hibernation mode is enabled when the cable is not > > plugged and automatically disable the clock after 10 seconds, so the > > 10 seconds is enough for the MAC to reset successfully). > > If multiple independent operations are synchronized based on the > assumption that 10 seconds should be enough, bad thing happens. > > If fully functional external clock provider is need to initialize the MAC, just > disabling this clock on already initialized HW without doing proper > re-initialization sequence is usually bad idea. HW may get some glitch which > will make troubleshooting a pain. >
> > > Hm.. how about officially defining this PHY as the clock provider > > > and disable PHY automatic hibernation as long as clock is acquired? > > > > > Sorry, I don't know much about the clock provider/consumer, but I > > think there will be more changes if we use clock provider/consume > mechanism. > > Less changes is not always best. What happens when a different PHY is used? > By having a clock provider in the PHY, you are defining a clear API that any > PHY needs to provide to work with this MAC. > Yes, as Russell mentioned the issue of suspend/resume, that i.MX platform uses the RTL8211 PHY. But now I don't have a clear idea to solve this issue on dwmac IP, the design of this IP is so different. Cc to Clark to aware of this issue on dwmac IP. > As Russell has point out, this is not the first time we have run into this > problem. So far, it seems everybody has tried to solve it for just their system. > So maybe now we should look at the whole picture and put in place a > generic solution which can be made to work for any PHY.
On Thu, Aug 10, 2023 at 02:49:55AM +0200, Marek Vasut wrote: > On 8/10/23 00:06, Andrew Lunn wrote: > > On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > > > On 8/9/23 15:40, Andrew Lunn wrote: > > > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > > > will be more changes if we use clock provider/consume mechanism. > > > > > > > > Less changes is not always best. What happens when a different PHY is > > > > used? > > > > > > Then the system wouldn't be affected by this AR803x specific behavior. > > > > Do you know it really is specific to the AR803x? Turning the clock off > > seams a reasonable thing to do when saving power, or when there is no > > link partner. > > This hibernation behavior seem specific to this PHY, I haven't seen it on > another PHY connected to the EQoS so far. Hm.. if I see it correctly I have right now kind of similar issues with Microchip KSZ9893 switch attached to EQoS. The switch is clock provider and I need to make sure that switch has CPU port enabled before probing the ethernet controller.
On Thu, Aug 10, 2023 at 03:28:13AM +0000, Wei Fang wrote: > > > Furthermore, > > > we would expect the hibernation mode is enabled when the ethernet > > > interface is brought up but the cable is not plugged, that is to say, > > > we only need the PHY to provide the clock for a while to make the MAC > > reset successfully. > > > > Means, if external clock is not provided, MAC is not fully functional. > > Correct? > > > The MAC will failed to do software reset if the PHY input clocks are not > present. Yes, that's well known to me - it's been brought up before with stmmac and phylink. I said earlier in this thread about this. > > For example, if stmmac_open() fails without external clock, will > > stmmac_release() work properly? > Actually, I don't know much about stmmac driver and the dwmac IP, > Because I'm not responsible for this IP on NXP i.MX platform. But I > have a look at the code of stmmac_release(), I think stmmac_release() > will work properly, because it invokes phylink_stop() and phylink_disconnect_phy() > first which will disable the clock from PHY. When tearing down a network device, there should be no reason to call phylink_stop() - unregistering the netdev will take the network device down, and thus call the .ndo_stop method if the device was up. .ndo_stop should already be calling phylink_stop(). I just don't get why driver authors don't check things like this...
On Thu, Aug 10, 2023 at 02:49:55AM +0200, Marek Vasut wrote: > On 8/10/23 00:06, Andrew Lunn wrote: > > On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > > > On 8/9/23 15:40, Andrew Lunn wrote: > > > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > > > will be more changes if we use clock provider/consume mechanism. > > > > > > > > Less changes is not always best. What happens when a different PHY is > > > > used? > > > > > > Then the system wouldn't be affected by this AR803x specific behavior. > > > > Do you know it really is specific to the AR803x? Turning the clock off > > seams a reasonable thing to do when saving power, or when there is no > > link partner. > > This hibernation behavior seem specific to this PHY, I haven't seen it on > another PHY connected to the EQoS so far. Marvell PHYs can be programmed so that RXCLK stops when the PHY enters power down or energy-detect state, although it defaults to always keeping the RGMII interface powered (and thus providing a clock.) One Micrel PHY - "To save more power, the KSZ9031RNX stops the RX_CLK clock output to the MAC after 10 or more RX_CLK clock cycles have occurred in the receive LPI state." which seems to imply if EEE is enabled, then the receive clock will be stopped when entering low-power state. I've said this several times in this thread - I think we need a bit in the PHY device's dev_flags to allow the MAC to say "do not power down the receive clock" which is used by the PHY drivers to (a) program the hardware to prevent the receive clock being stopped in situations such as the AR803x hibernate mode, and (b) to program the hardware not to stop the receive clock when entering EEE low power. This does seem to be a generic thing and not specific to just one PHY - especially as the stopping of clocks when entering EEE low power is a IEEE 802.3 defined thing.
On Thu, Aug 10, 2023 at 06:32:42AM +0200, Oleksij Rempel wrote: > On Thu, Aug 10, 2023 at 02:49:55AM +0200, Marek Vasut wrote: > > On 8/10/23 00:06, Andrew Lunn wrote: > > > On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > > > > On 8/9/23 15:40, Andrew Lunn wrote: > > > > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > > > > will be more changes if we use clock provider/consume mechanism. > > > > > > > > > > Less changes is not always best. What happens when a different PHY is > > > > > used? > > > > > > > > Then the system wouldn't be affected by this AR803x specific behavior. > > > > > > Do you know it really is specific to the AR803x? Turning the clock off > > > seams a reasonable thing to do when saving power, or when there is no > > > link partner. > > > > This hibernation behavior seem specific to this PHY, I haven't seen it on > > another PHY connected to the EQoS so far. > > Hm.. if I see it correctly I have right now kind of similar issues with > Microchip KSZ9893 switch attached to EQoS. The switch is clock > provider and I need to make sure that switch has CPU port enabled before > probing the ethernet controller. ... and Cadence macb. So, this is a thing, and we need to be able to cater for it. Can we agree that: (a) some ethernet MAC controllers need the RGMII receive clock to function. (b) IEEE 802.3 permits clocks to be disabled when entering EEE low power state, and there is a defined control bit to allow this to happen - so IEEE 802.3 _recognises_ that some MAC controllers need this clock whereas others do not. Therefore: (c) Given that IEEE 802.3 appears to recognise this, we should add some sort of control to phylib so that MACs can tell phylib that they require the receive clock to always be present. (d) We need to ensure that PHY drivers respect that bit and program the hardware appropriately to ensure that where a MAC always needs a receive clock, it is maintained. (e) MACs that always need the receive clock enabled need to indicate to phylib/phylink that this is the case. My suggestion is to use a bit on the PHY device dev_flags (bit 30) for this.
On Wed, Aug 09, 2023 at 09:43:49AM +0100, Russell King (Oracle) wrote: > On Wed, Aug 09, 2023 at 08:08:36AM +0200, Oleksij Rempel wrote: > > If fully functional external clock provider is need to initialize the > > MAC, just disabling this clock on already initialized HW without doing > > proper re-initialization sequence is usually bad idea. HW may get some > > glitch which will make troubleshooting a pain. > > There are cases where the PHY sits on a MDIO bus that is created > by the ethernet MAC driver, which means the PHY only exists during > the ethernet MAC driver probe. > > I think that provided the clock is only obtained after we know the > PHY is present, that would probably be fine - but doing it before > the MDIO bus has been created will of course cause problems. > > We've had these issues before with stmmac, so this "stmmac needs the > PHY receive clock" is nothing new - it's had problems with system > suspend/resume in the past, and I've made suggestions... and when > there's been two people trying to work on it, I've attempted to get > them to talk to each other which resulted in nothing further > happening. > > Another solution could possibly be that we reserve bit 30 on the > PHY dev_flags to indicate that the receive clock must always be > provided. I suspect that would have an advantage in another > situation - for EEE, there's a control bit which allows the > receive clock to be stopped while the link is in low-power state. > If the MAC always needs the receive clock, then obviously that > should be blocked. Something like this for starters: diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index fcab363d8dfa..a954f1d61709 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1254,6 +1254,11 @@ static int stmmac_phy_setup(struct stmmac_priv *priv) ~(MAC_10HD | MAC_100HD | MAC_1000HD); priv->phylink_config.mac_managed_pm = true; + /* stmmac always requires a receive clock in order for things like + * hardware reset to work. + */ + priv->phylink_config.mac_requires_rxc = true; + phylink = phylink_create(&priv->phylink_config, fwnode, mode, &stmmac_phylink_mac_ops); if (IS_ERR(phylink)) diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c index 13c4121fa309..619a63a0d14f 100644 --- a/drivers/net/phy/at803x.c +++ b/drivers/net/phy/at803x.c @@ -990,7 +990,8 @@ static int at803x_hibernation_mode_config(struct phy_device *phydev) /* The default after hardware reset is hibernation mode enabled. After * software reset, the value is retained. */ - if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) + if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE) && + !(phydev->dev_flags & PHY_F_RXC_ALWAYS_ON)) return 0; return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 3e9909b30938..4d1a37487923 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -3216,6 +3216,8 @@ static int phy_probe(struct device *dev) goto out; } + phy_disable_interrupts(phydev); + /* Start out supporting everything. Eventually, * a controller will attach, and may modify one * or both of these values @@ -3333,16 +3335,6 @@ static int phy_remove(struct device *dev) return 0; } -static void phy_shutdown(struct device *dev) -{ - struct phy_device *phydev = to_phy_device(dev); - - if (phydev->state == PHY_READY || !phydev->attached_dev) - return; - - phy_disable_interrupts(phydev); -} - /** * phy_driver_register - register a phy_driver with the PHY layer * @new_driver: new phy_driver to register @@ -3376,7 +3368,6 @@ int phy_driver_register(struct phy_driver *new_driver, struct module *owner) new_driver->mdiodrv.driver.bus = &mdio_bus_type; new_driver->mdiodrv.driver.probe = phy_probe; new_driver->mdiodrv.driver.remove = phy_remove; - new_driver->mdiodrv.driver.shutdown = phy_shutdown; new_driver->mdiodrv.driver.owner = owner; new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS; diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 4f1c8bb199e9..6568a2759101 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -1830,6 +1830,8 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy, static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy, phy_interface_t interface) { + u32 flags = 0; + if (WARN_ON(pl->cfg_link_an_mode == MLO_AN_FIXED || (pl->cfg_link_an_mode == MLO_AN_INBAND && phy_interface_mode_is_8023z(interface) && !pl->sfp_bus))) @@ -1838,7 +1840,10 @@ static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy, if (pl->phydev) return -EBUSY; - return phy_attach_direct(pl->netdev, phy, 0, interface); + if (pl->config.mac_requires_rxc) + flags |= PHY_F_RXC_ALWAYS_ON; + + return phy_attach_direct(pl->netdev, phy, flags, interface); } /** @@ -1941,6 +1946,9 @@ int phylink_fwnode_phy_connect(struct phylink *pl, pl->link_config.interface = pl->link_interface; } + if (pl->config.mac_requires_rxc) + flags |= PHY_F_RXC_ALWAYS_ON; + ret = phy_attach_direct(pl->netdev, phy_dev, flags, pl->link_interface); phy_device_free(phy_dev); diff --git a/include/linux/phy.h b/include/linux/phy.h index ba08b0e60279..79df5e01707d 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -761,6 +761,7 @@ struct phy_device { /* Generic phy_device::dev_flags */ #define PHY_F_NO_IRQ 0x80000000 +#define PHY_F_RXC_ALWAYS_ON BIT(30) static inline struct phy_device *to_phy_device(const struct device *dev) { diff --git a/include/linux/phylink.h b/include/linux/phylink.h index 789c516c6b4a..a83c1a77338f 100644 --- a/include/linux/phylink.h +++ b/include/linux/phylink.h @@ -204,6 +204,7 @@ enum phylink_op_type { * @poll_fixed_state: if true, starts link_poll, * if MAC link is at %MLO_AN_FIXED mode. * @mac_managed_pm: if true, indicate the MAC driver is responsible for PHY PM. + * @mac_requires_rxc: if true, the MAC always requires a receive clock from PHY. * @ovr_an_inband: if true, override PCS to MLO_AN_INBAND * @get_fixed_state: callback to execute to determine the fixed link state, * if MAC link is at %MLO_AN_FIXED mode. @@ -216,6 +217,7 @@ struct phylink_config { enum phylink_op_type type; bool poll_fixed_state; bool mac_managed_pm; + bool mac_requires_rxc; bool ovr_an_inband; void (*get_fixed_state)(struct phylink_config *config, struct phylink_link_state *state);
On Thu, Aug 10, 2023 at 11:01:53AM +0100, Russell King (Oracle) wrote: > On Thu, Aug 10, 2023 at 02:49:55AM +0200, Marek Vasut wrote: > > On 8/10/23 00:06, Andrew Lunn wrote: > > > On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > > > > On 8/9/23 15:40, Andrew Lunn wrote: > > > > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > > > > will be more changes if we use clock provider/consume mechanism. > > > > > > > > > > Less changes is not always best. What happens when a different PHY is > > > > > used? > > > > > > > > Then the system wouldn't be affected by this AR803x specific behavior. > > > > > > Do you know it really is specific to the AR803x? Turning the clock off > > > seams a reasonable thing to do when saving power, or when there is no > > > link partner. > > > > This hibernation behavior seem specific to this PHY, I haven't seen it on > > another PHY connected to the EQoS so far. > > Marvell PHYs can be programmed so that RXCLK stops when the PHY > enters power down or energy-detect state, although it defaults to > always keeping the RGMII interface powered (and thus providing a > clock.) > > One Micrel PHY - "To save more power, the KSZ9031RNX stops the RX_CLK > clock output to the MAC after 10 or more RX_CLK clock > cycles have occurred in the receive LPI state." which seems to imply > if EEE is enabled, then the receive clock will be stopped when > entering low-power state. > > I've said this several times in this thread - I think we need a bit > in the PHY device's dev_flags to allow the MAC to say "do not power > down the receive clock" which is used by the PHY drivers to (a) program > the hardware to prevent the receive clock being stopped in situations > such as the AR803x hibernate mode, and (b) to program the hardware not > to stop the receive clock when entering EEE low power. This does seem > to be a generic thing and not specific to just one PHY - especially as > the stopping of clocks when entering EEE low power is a IEEE 802.3 > defined thing. Like this: diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index fcab363d8dfa..a954f1d61709 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1254,6 +1254,11 @@ static int stmmac_phy_setup(struct stmmac_priv *priv) ~(MAC_10HD | MAC_100HD | MAC_1000HD); priv->phylink_config.mac_managed_pm = true; + /* stmmac always requires a receive clock in order for things like + * hardware reset to work. + */ + priv->phylink_config.mac_requires_rxc = true; + phylink = phylink_create(&priv->phylink_config, fwnode, mode, &stmmac_phylink_mac_ops); if (IS_ERR(phylink)) diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c index 13c4121fa309..619a63a0d14f 100644 --- a/drivers/net/phy/at803x.c +++ b/drivers/net/phy/at803x.c @@ -990,7 +990,8 @@ static int at803x_hibernation_mode_config(struct phy_device *phydev) /* The default after hardware reset is hibernation mode enabled. After * software reset, the value is retained. */ - if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) + if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE) && + !(phydev->dev_flags & PHY_F_RXC_ALWAYS_ON)) return 0; return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 3e9909b30938..4d1a37487923 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -3216,6 +3216,8 @@ static int phy_probe(struct device *dev) goto out; } + phy_disable_interrupts(phydev); + /* Start out supporting everything. Eventually, * a controller will attach, and may modify one * or both of these values @@ -3333,16 +3335,6 @@ static int phy_remove(struct device *dev) return 0; } -static void phy_shutdown(struct device *dev) -{ - struct phy_device *phydev = to_phy_device(dev); - - if (phydev->state == PHY_READY || !phydev->attached_dev) - return; - - phy_disable_interrupts(phydev); -} - /** * phy_driver_register - register a phy_driver with the PHY layer * @new_driver: new phy_driver to register @@ -3376,7 +3368,6 @@ int phy_driver_register(struct phy_driver *new_driver, struct module *owner) new_driver->mdiodrv.driver.bus = &mdio_bus_type; new_driver->mdiodrv.driver.probe = phy_probe; new_driver->mdiodrv.driver.remove = phy_remove; - new_driver->mdiodrv.driver.shutdown = phy_shutdown; new_driver->mdiodrv.driver.owner = owner; new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS; diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 4f1c8bb199e9..6568a2759101 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -1830,6 +1830,8 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy, static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy, phy_interface_t interface) { + u32 flags = 0; + if (WARN_ON(pl->cfg_link_an_mode == MLO_AN_FIXED || (pl->cfg_link_an_mode == MLO_AN_INBAND && phy_interface_mode_is_8023z(interface) && !pl->sfp_bus))) @@ -1838,7 +1840,10 @@ static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy, if (pl->phydev) return -EBUSY; - return phy_attach_direct(pl->netdev, phy, 0, interface); + if (pl->config.mac_requires_rxc) + flags |= PHY_F_RXC_ALWAYS_ON; + + return phy_attach_direct(pl->netdev, phy, flags, interface); } /** @@ -1941,6 +1946,9 @@ int phylink_fwnode_phy_connect(struct phylink *pl, pl->link_config.interface = pl->link_interface; } + if (pl->config.mac_requires_rxc) + flags |= PHY_F_RXC_ALWAYS_ON; + ret = phy_attach_direct(pl->netdev, phy_dev, flags, pl->link_interface); phy_device_free(phy_dev); diff --git a/include/linux/phy.h b/include/linux/phy.h index ba08b0e60279..79df5e01707d 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -761,6 +761,7 @@ struct phy_device { /* Generic phy_device::dev_flags */ #define PHY_F_NO_IRQ 0x80000000 +#define PHY_F_RXC_ALWAYS_ON BIT(30) static inline struct phy_device *to_phy_device(const struct device *dev) { diff --git a/include/linux/phylink.h b/include/linux/phylink.h index 789c516c6b4a..a83c1a77338f 100644 --- a/include/linux/phylink.h +++ b/include/linux/phylink.h @@ -204,6 +204,7 @@ enum phylink_op_type { * @poll_fixed_state: if true, starts link_poll, * if MAC link is at %MLO_AN_FIXED mode. * @mac_managed_pm: if true, indicate the MAC driver is responsible for PHY PM. + * @mac_requires_rxc: if true, the MAC always requires a receive clock from PHY. * @ovr_an_inband: if true, override PCS to MLO_AN_INBAND * @get_fixed_state: callback to execute to determine the fixed link state, * if MAC link is at %MLO_AN_FIXED mode. @@ -216,6 +217,7 @@ struct phylink_config { enum phylink_op_type type; bool poll_fixed_state; bool mac_managed_pm; + bool mac_requires_rxc; bool ovr_an_inband; void (*get_fixed_state)(struct phylink_config *config, struct phylink_link_state *state);
On Thu, Aug 10, 2023 at 12:15:14AM +0100, Russell King (Oracle) wrote: > On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > > On 8/9/23 15:40, Andrew Lunn wrote: > > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > > will be more changes if we use clock provider/consume mechanism. > > > > > > Less changes is not always best. What happens when a different PHY is > > > used? > > > > Then the system wouldn't be affected by this AR803x specific behavior. > > I don't think it is AR803x specific behaviour. I think it is an > interesting stmmac behaviour, where it requires the clock from > the PHY in order to do even the most trivial things (like reset > itself.) That is a very interesting design decision. > > how does stmmac hardware complete a power-on reset if it needs > the external clock from a PHY that itself might be in the process > of powering itself up and establishing its clocks... > > There have been *hacks* to phylink requested over the years to > work around this peculiar behaviour by stmmac - and it seems that > it's not common behaviour. > > This kind of thing might affect Cadence's macb driver as well, but > rather than it being a clock from the ethernet PHY, it seems to be > from the serdes PHY built in to the SoC - if I understand what's > reported in the proposed patch commit log (which I don't fully.) > > In the case of stmmac, I don't think it's fair to blame the AR803x. > It's a hardware integration issue - the AR803x implementation which > works fine elsewhere has a problem with the stmmac implementation, > because design decisions made in both implementations end up being > incompatible with each other. > > However, pair them with different implementations, and they're fine. > > Given that stmmac requires a clock from the PHY, I'm of the opinion > that we need to have a way to tell phylib that "hey, this MAC must > always have a receive clock from the PHY, please arrange for that > to happen". AR803x needs to check that and arrange for the receive > clock to always be output. phylib can also use that to ensure that > when EEE mode is active in the PHY, clock-stop support is disabled... > and that's actually a key part to getting EEE properly implemented. > > Clearly, the IEEE 802.3 folk catered for this issue when specifying > EEE, where some MACs must always be fed a receive clock, and so I > think phylib in Linux needs to recognise that this is A Thing that > it should allow MACs to specify. Like this: diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index fcab363d8dfa..a954f1d61709 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1254,6 +1254,11 @@ static int stmmac_phy_setup(struct stmmac_priv *priv) ~(MAC_10HD | MAC_100HD | MAC_1000HD); priv->phylink_config.mac_managed_pm = true; + /* stmmac always requires a receive clock in order for things like + * hardware reset to work. + */ + priv->phylink_config.mac_requires_rxc = true; + phylink = phylink_create(&priv->phylink_config, fwnode, mode, &stmmac_phylink_mac_ops); if (IS_ERR(phylink)) diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c index 13c4121fa309..619a63a0d14f 100644 --- a/drivers/net/phy/at803x.c +++ b/drivers/net/phy/at803x.c @@ -990,7 +990,8 @@ static int at803x_hibernation_mode_config(struct phy_device *phydev) /* The default after hardware reset is hibernation mode enabled. After * software reset, the value is retained. */ - if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE)) + if (!(priv->flags & AT803X_DISABLE_HIBERNATION_MODE) && + !(phydev->dev_flags & PHY_F_RXC_ALWAYS_ON)) return 0; return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 3e9909b30938..4d1a37487923 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -3216,6 +3216,8 @@ static int phy_probe(struct device *dev) goto out; } + phy_disable_interrupts(phydev); + /* Start out supporting everything. Eventually, * a controller will attach, and may modify one * or both of these values @@ -3333,16 +3335,6 @@ static int phy_remove(struct device *dev) return 0; } -static void phy_shutdown(struct device *dev) -{ - struct phy_device *phydev = to_phy_device(dev); - - if (phydev->state == PHY_READY || !phydev->attached_dev) - return; - - phy_disable_interrupts(phydev); -} - /** * phy_driver_register - register a phy_driver with the PHY layer * @new_driver: new phy_driver to register @@ -3376,7 +3368,6 @@ int phy_driver_register(struct phy_driver *new_driver, struct module *owner) new_driver->mdiodrv.driver.bus = &mdio_bus_type; new_driver->mdiodrv.driver.probe = phy_probe; new_driver->mdiodrv.driver.remove = phy_remove; - new_driver->mdiodrv.driver.shutdown = phy_shutdown; new_driver->mdiodrv.driver.owner = owner; new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS; diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 4f1c8bb199e9..6568a2759101 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -1830,6 +1830,8 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy, static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy, phy_interface_t interface) { + u32 flags = 0; + if (WARN_ON(pl->cfg_link_an_mode == MLO_AN_FIXED || (pl->cfg_link_an_mode == MLO_AN_INBAND && phy_interface_mode_is_8023z(interface) && !pl->sfp_bus))) @@ -1838,7 +1840,10 @@ static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy, if (pl->phydev) return -EBUSY; - return phy_attach_direct(pl->netdev, phy, 0, interface); + if (pl->config.mac_requires_rxc) + flags |= PHY_F_RXC_ALWAYS_ON; + + return phy_attach_direct(pl->netdev, phy, flags, interface); } /** @@ -1941,6 +1946,9 @@ int phylink_fwnode_phy_connect(struct phylink *pl, pl->link_config.interface = pl->link_interface; } + if (pl->config.mac_requires_rxc) + flags |= PHY_F_RXC_ALWAYS_ON; + ret = phy_attach_direct(pl->netdev, phy_dev, flags, pl->link_interface); phy_device_free(phy_dev); diff --git a/include/linux/phy.h b/include/linux/phy.h index ba08b0e60279..79df5e01707d 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -761,6 +761,7 @@ struct phy_device { /* Generic phy_device::dev_flags */ #define PHY_F_NO_IRQ 0x80000000 +#define PHY_F_RXC_ALWAYS_ON BIT(30) static inline struct phy_device *to_phy_device(const struct device *dev) { diff --git a/include/linux/phylink.h b/include/linux/phylink.h index 789c516c6b4a..a83c1a77338f 100644 --- a/include/linux/phylink.h +++ b/include/linux/phylink.h @@ -204,6 +204,7 @@ enum phylink_op_type { * @poll_fixed_state: if true, starts link_poll, * if MAC link is at %MLO_AN_FIXED mode. * @mac_managed_pm: if true, indicate the MAC driver is responsible for PHY PM. + * @mac_requires_rxc: if true, the MAC always requires a receive clock from PHY. * @ovr_an_inband: if true, override PCS to MLO_AN_INBAND * @get_fixed_state: callback to execute to determine the fixed link state, * if MAC link is at %MLO_AN_FIXED mode. @@ -216,6 +217,7 @@ struct phylink_config { enum phylink_op_type type; bool poll_fixed_state; bool mac_managed_pm; + bool mac_requires_rxc; bool ovr_an_inband; void (*get_fixed_state)(struct phylink_config *config, struct phylink_link_state *state);
On Thu, Aug 10, 2023 at 11:34:02AM +0100, Russell King (Oracle) wrote: > On Thu, Aug 10, 2023 at 11:01:53AM +0100, Russell King (Oracle) wrote: > > On Thu, Aug 10, 2023 at 02:49:55AM +0200, Marek Vasut wrote: > > > On 8/10/23 00:06, Andrew Lunn wrote: > > > > On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > > > > > On 8/9/23 15:40, Andrew Lunn wrote: > > > > > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > > > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > > > > > will be more changes if we use clock provider/consume mechanism. > > > > > > > > > > > > Less changes is not always best. What happens when a different PHY is > > > > > > used? > > > > > > > > > > Then the system wouldn't be affected by this AR803x specific behavior. > > > > > > > > Do you know it really is specific to the AR803x? Turning the clock off > > > > seams a reasonable thing to do when saving power, or when there is no > > > > link partner. > > > > > > This hibernation behavior seem specific to this PHY, I haven't seen it on > > > another PHY connected to the EQoS so far. > > > > Marvell PHYs can be programmed so that RXCLK stops when the PHY > > enters power down or energy-detect state, although it defaults to > > always keeping the RGMII interface powered (and thus providing a > > clock.) > > > > One Micrel PHY - "To save more power, the KSZ9031RNX stops the RX_CLK > > clock output to the MAC after 10 or more RX_CLK clock > > cycles have occurred in the receive LPI state." which seems to imply > > if EEE is enabled, then the receive clock will be stopped when > > entering low-power state. > > > > I've said this several times in this thread - I think we need a bit > > in the PHY device's dev_flags to allow the MAC to say "do not power > > down the receive clock" which is used by the PHY drivers to (a) program > > the hardware to prevent the receive clock being stopped in situations > > such as the AR803x hibernate mode, and (b) to program the hardware not > > to stop the receive clock when entering EEE low power. This does seem > > to be a generic thing and not specific to just one PHY - especially as > > the stopping of clocks when entering EEE low power is a IEEE 802.3 > > defined thing. > > Like this: ... > > + phy_disable_interrupts(phydev); > + > /* Start out supporting everything. Eventually, > * a controller will attach, and may modify one > * or both of these values > @@ -3333,16 +3335,6 @@ static int phy_remove(struct device *dev) > return 0; > } > > -static void phy_shutdown(struct device *dev) > -{ > - struct phy_device *phydev = to_phy_device(dev); > - > - if (phydev->state == PHY_READY || !phydev->attached_dev) > - return; > - > - phy_disable_interrupts(phydev); > -} > - Except of shutdown part from other discussion, looks fine for me. What will be the best way to solve this issue for DSA switches attached to MAC with RGMII RXC requirements? Regards, Oleksij
On Thu, Aug 10, 2023 at 02:51:17PM +0200, Oleksij Rempel wrote: > On Thu, Aug 10, 2023 at 11:34:02AM +0100, Russell King (Oracle) wrote: > > On Thu, Aug 10, 2023 at 11:01:53AM +0100, Russell King (Oracle) wrote: > > > On Thu, Aug 10, 2023 at 02:49:55AM +0200, Marek Vasut wrote: > > > > On 8/10/23 00:06, Andrew Lunn wrote: > > > > > On Wed, Aug 09, 2023 at 11:34:19PM +0200, Marek Vasut wrote: > > > > > > On 8/9/23 15:40, Andrew Lunn wrote: > > > > > > > > > Hm.. how about officially defining this PHY as the clock provider and disable > > > > > > > > > PHY automatic hibernation as long as clock is acquired? > > > > > > > > > > > > > > > > > Sorry, I don't know much about the clock provider/consumer, but I think there > > > > > > > > will be more changes if we use clock provider/consume mechanism. > > > > > > > > > > > > > > Less changes is not always best. What happens when a different PHY is > > > > > > > used? > > > > > > > > > > > > Then the system wouldn't be affected by this AR803x specific behavior. > > > > > > > > > > Do you know it really is specific to the AR803x? Turning the clock off > > > > > seams a reasonable thing to do when saving power, or when there is no > > > > > link partner. > > > > > > > > This hibernation behavior seem specific to this PHY, I haven't seen it on > > > > another PHY connected to the EQoS so far. > > > > > > Marvell PHYs can be programmed so that RXCLK stops when the PHY > > > enters power down or energy-detect state, although it defaults to > > > always keeping the RGMII interface powered (and thus providing a > > > clock.) > > > > > > One Micrel PHY - "To save more power, the KSZ9031RNX stops the RX_CLK > > > clock output to the MAC after 10 or more RX_CLK clock > > > cycles have occurred in the receive LPI state." which seems to imply > > > if EEE is enabled, then the receive clock will be stopped when > > > entering low-power state. > > > > > > I've said this several times in this thread - I think we need a bit > > > in the PHY device's dev_flags to allow the MAC to say "do not power > > > down the receive clock" which is used by the PHY drivers to (a) program > > > the hardware to prevent the receive clock being stopped in situations > > > such as the AR803x hibernate mode, and (b) to program the hardware not > > > to stop the receive clock when entering EEE low power. This does seem > > > to be a generic thing and not specific to just one PHY - especially as > > > the stopping of clocks when entering EEE low power is a IEEE 802.3 > > > defined thing. > > > > Like this: > ... > > > > + phy_disable_interrupts(phydev); > > + > > /* Start out supporting everything. Eventually, > > * a controller will attach, and may modify one > > * or both of these values > > @@ -3333,16 +3335,6 @@ static int phy_remove(struct device *dev) > > return 0; > > } > > > > -static void phy_shutdown(struct device *dev) > > -{ > > - struct phy_device *phydev = to_phy_device(dev); > > - > > - if (phydev->state == PHY_READY || !phydev->attached_dev) > > - return; > > - > > - phy_disable_interrupts(phydev); > > -} > > - > > Except of shutdown part from other discussion, looks fine for me. Sigh... clearly I can't cope with email, especially when most of the subject line is hidden! All I see in the index is "Re: [PATCH] net: phy: at803x: Improve hi" for this thread, and I can't remember what the other thread was... and it's buried in email. > What will be the best way to solve this issue for DSA switches attached to > MAC with RGMII RXC requirements? I have no idea - the problem there is the model that has been adopted in Linux is that there is no direct relationship between the DSA switch and the MAC like there is with a PHY. The MAC sees only a fixed-link, as does the DSA switch. So from the software perspective, it's: MAC ---- fixed-link Switch ---- fixed-link And it just so happens that packets pass between the two by "magic". With a PHY, there is a direct relationship: MAC ---- PHY The MAC has a direct relationship with the PHY to the extent that the MAC driver is involved in managing the PHY through phylink/phylib. The effect of that de-coupling is that the MAC driver becomes quite independent of the switch driver - I don't believe the switch driver can query the MAC driver at probe time, nor the other way. If I'm mistaken about that, someone needs to say! What that leaves us with is firmware telling the switch driver...
> > What will be the best way to solve this issue for DSA switches attached to > > MAC with RGMII RXC requirements? > > I have no idea - the problem there is the model that has been adopted > in Linux is that there is no direct relationship between the DSA switch > and the MAC like there is with a PHY. A clock provider/consumer relationship can be expressed in DT. The DSA switch port would provide the clock, rather than the PHY. Andrew
On Thu, Aug 10, 2023 at 03:49:24PM +0200, Andrew Lunn wrote: > > > What will be the best way to solve this issue for DSA switches attached to > > > MAC with RGMII RXC requirements? > > > > I have no idea - the problem there is the model that has been adopted > > in Linux is that there is no direct relationship between the DSA switch > > and the MAC like there is with a PHY. > > A clock provider/consumer relationship can be expressed in DT. The DSA > switch port would provide the clock, rather than the PHY. Then we'll be in to people wanting to do it for PHYs as well, and as we've recently discussed that isn't something we want because of the dependencies it creates between mdio drivers and mac drivers. Wouldn't the same dependency issue also apply for a DSA switch on a MDIO bus, where the MDIO bus is part of the MAC driver?
On Thu, Aug 10, 2023 at 02:54:58PM +0100, Russell King (Oracle) wrote: > On Thu, Aug 10, 2023 at 03:49:24PM +0200, Andrew Lunn wrote: > > > > What will be the best way to solve this issue for DSA switches attached to > > > > MAC with RGMII RXC requirements? > > > > > > I have no idea - the problem there is the model that has been adopted > > > in Linux is that there is no direct relationship between the DSA switch > > > and the MAC like there is with a PHY. > > > > A clock provider/consumer relationship can be expressed in DT. The DSA > > switch port would provide the clock, rather than the PHY. > > Then we'll be in to people wanting to do it for PHYs as well, and as > we've recently discussed that isn't something we want because of the > dependencies it creates between mdio drivers and mac drivers. > > Wouldn't the same dependency issue also apply for a DSA switch on a > MDIO bus, where the MDIO bus is part of the MAC driver? We already have some level of circular dependencies with DSA, e.g. the MAC driver provides the MDIO bus with the switch on it. It registers the MDIO bus, causing the switch to probe. That probe fails because the MAC driver has not registered its interface yet, which is the CPU interface. We end up deferring the switch probe, and by the second attempt, the MAC is fully registered and the switch probes. The circular dependency with a clock consumer/provider between the MAC and switch will be worse. We would need to avoid getting the clock in the probe function. It would need to happen in during open, by which time the switch should be present. MAC drivers also typically connect to their PHY during open, not probe, so i don't see this as being too big a problem. As to unbind, my gut feeling is, the general case is too complex. It is not a simple tree, where you can ensure unbind from the leaves towards the root. Either we let root take its chance, or we just block unbind. Andrew
On Thu, Aug 10, 2023 at 04:23:08PM +0200, Andrew Lunn wrote: > On Thu, Aug 10, 2023 at 02:54:58PM +0100, Russell King (Oracle) wrote: > > On Thu, Aug 10, 2023 at 03:49:24PM +0200, Andrew Lunn wrote: > > > > > What will be the best way to solve this issue for DSA switches attached to > > > > > MAC with RGMII RXC requirements? > > > > > > > > I have no idea - the problem there is the model that has been adopted > > > > in Linux is that there is no direct relationship between the DSA switch > > > > and the MAC like there is with a PHY. > > > > > > A clock provider/consumer relationship can be expressed in DT. The DSA > > > switch port would provide the clock, rather than the PHY. > > > > Then we'll be in to people wanting to do it for PHYs as well, and as > > we've recently discussed that isn't something we want because of the > > dependencies it creates between mdio drivers and mac drivers. > > > > Wouldn't the same dependency issue also apply for a DSA switch on a > > MDIO bus, where the MDIO bus is part of the MAC driver? > > We already have some level of circular dependencies with DSA, e.g. the > MAC driver provides the MDIO bus with the switch on it. It registers > the MDIO bus, causing the switch to probe. That probe fails because > the MAC driver has not registered its interface yet, which is the CPU > interface. We end up deferring the switch probe, and by the second > attempt, the MAC is fully registered and the switch probes. If that sequence occurs with a MAC that wants a clock from DSA, then we're at a loss... The DSA driver probe fails because the MAC hasn't fully registered itself, so doesn't create the clock. The MAC driver tries to get the clock but it doesn't exist, so it defers, tearing down the MDIO bus in the process. > The circular dependency with a clock consumer/provider between the MAC > and switch will be worse. We would need to avoid getting the clock in > the probe function. It would need to happen in during open, by which > time the switch should be present. MAC drivers also typically connect > to their PHY during open, not probe, so i don't see this as being too > big a problem. Providing nothing is happening in the MAC driver initialisation which requires that clock, then that would be fine. Removal should be possible provided the MAC driver doesn't need anything before it removes the MDIO bus.
Hello Russell, On Thu, 10 Aug 2023, Russell King (Oracle) wrote: > > We've had these issues before with stmmac, so this "stmmac needs the > > PHY receive clock" is nothing new - it's had problems with system > > suspend/resume in the past, and I've made suggestions... and when > > there's been two people trying to work on it, I've attempted to get > > them to talk to each other which resulted in nothing further > > happening. > > > > Another solution could possibly be that we reserve bit 30 on the > > PHY dev_flags to indicate that the receive clock must always be > > provided. I suspect that would have an advantage in another > ... > > Something like this for starters: > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > ... I've implemented and tested the general-case solution you proposed to this receive clock issue with stmmac drivers. The core of your suggestion is pretty much unchanged, I just added a phylink_pcs flag for standalone PCS drivers that also need to provide the receive clock. I'd like to send a series for this upstream, which would allow solving this issue for both the DWMAC RZN1 case and the AT803x PHY suspend/hibernate case (and also potentially other cases with a similar bug). I wanted to ask you how you would prefer to be credited in my patch series. I was considering putting you as author and first signer of the initial patch adding the phy_dev flag. Would that be okay or would you prefer something else? Best Regards,
On 12/14/23 09:13, Romain Gantois wrote: > > Hello Russell, > > On Thu, 10 Aug 2023, Russell King (Oracle) wrote: > >>> We've had these issues before with stmmac, so this "stmmac needs the >>> PHY receive clock" is nothing new - it's had problems with system >>> suspend/resume in the past, and I've made suggestions... and when >>> there's been two people trying to work on it, I've attempted to get >>> them to talk to each other which resulted in nothing further >>> happening. >>> >>> Another solution could possibly be that we reserve bit 30 on the >>> PHY dev_flags to indicate that the receive clock must always be >>> provided. I suspect that would have an advantage in another >> ... >> >> Something like this for starters: >> >> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c >> ... > > I've implemented and tested the general-case solution you proposed to this > receive clock issue with stmmac drivers. The core of your suggestion is pretty > much unchanged, I just added a phylink_pcs flag for standalone PCS drivers that > also need to provide the receive clock. > > I'd like to send a series for this upstream, which would allow solving this > issue for both the DWMAC RZN1 case and the AT803x PHY suspend/hibernate case > (and also potentially other cases with a similar bug). > > I wanted to ask you how you would prefer to be credited in my patch series. I > was considering putting you as author and first signer of the initial patch > adding the phy_dev flag. Would that be okay or would you prefer something else? Credit it whichever way you see fit, don't worry, better focus on the fix. I can test the result on MX8MM/MX8MP, so feel free to CC me on that.
On Thu, Dec 14, 2023 at 09:13:58AM +0100, Romain Gantois wrote: > Hello Russell, > > I've implemented and tested the general-case solution you proposed to this > receive clock issue with stmmac drivers. The core of your suggestion is pretty > much unchanged, I just added a phylink_pcs flag for standalone PCS drivers that > also need to provide the receive clock. So this affects the ability of PCS to operate correctly as well as MACs? Would you enlighten which PCS are affected, and what PCS <--> PHY link modes this is required for? > I'd like to send a series for this upstream, which would allow solving this > issue for both the DWMAC RZN1 case and the AT803x PHY suspend/hibernate case > (and also potentially other cases with a similar bug). > > I wanted to ask you how you would prefer to be credited in my patch series. I > was considering putting you as author and first signer of the initial patch > adding the phy_dev flag. Would that be okay or would you prefer something else? It depends how big the changes are from my patches. If more than 50% of the patch remains my work, please retain my authorship. If you wish to also indicate your authorship, then there is a mechanism to do that - Co-developed-by: from submitting-patches.rst: Co-developed-by: states that the patch was co-created by multiple developers; it is used to give attribution to co-authors (in addition to the author attributed by the From: tag) when several people work on a single patch. Since Co-developed-by: denotes authorship, every Co-developed-by: must be immediately followed by a Signed-off-by: of the associated co-author. See submitting-patches.rst for examples of the ordering of the attributations. Thanks for asking.
On Thu, 14 Dec 2023, Russell King (Oracle) wrote: > On Thu, Dec 14, 2023 at 09:13:58AM +0100, Romain Gantois wrote: > > Hello Russell, > > > > I've implemented and tested the general-case solution you proposed to this > > receive clock issue with stmmac drivers. The core of your suggestion is pretty > > much unchanged, I just added a phylink_pcs flag for standalone PCS drivers that > > also need to provide the receive clock. > > So this affects the ability of PCS to operate correctly as well as MACs? > Would you enlighten which PCS are affected, and what PCS <--> PHY link > modes this is required for? The affected hardware is the RZN1 GMAC1 that is found in the r9a06g032 SoC from Renesas. This MAC controller is internally connected to a custom PCS that functions as a RGMII converter. This converter is handled by a standalone PCS driver that is already upstream, unlike the GMAC1 driver. So in hardware, the MAC/PHY links are organised this way: RZN1 GMAC1 <--[GMII]--> MII_CONV1 (internal) <--[RGMII]--> external PHY The issue is that the RX clock from the external PHY has to be transmitted by the MII converter before it reaches GMAC1. Therefore, if the RZN1 PCS driver isn't configured to let the clock through before the stmmac GMAC1 driver initializes its hardware, then said initialization will fail with a vague DMA error. To solve this, I added a flag to phylink_pcs and made the phylink core set it in phylink_validate_mac_and_pcs(). This gives the PCS driver a chance to check the flag in pcs_validate() and allow the clock through before the GMAC1 net device is brought up. Regards,
diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c index 13c4121fa3092..8cb7b39c6cddc 100644 --- a/drivers/net/phy/at803x.c +++ b/drivers/net/phy/at803x.c @@ -986,6 +986,25 @@ static int at8031_pll_config(struct phy_device *phydev) static int at803x_hibernation_mode_config(struct phy_device *phydev) { struct at803x_priv *priv = phydev->priv; + int ret; + + /* Toggle hibernation mode OFF and ON to wake the PHY up and + * make it generate clock on RX_CLK pin for about 10 seconds. + * These clock are needed during start up by MACs like DWMAC + * in NXP i.MX8M Plus to release their DMA from reset. After + * the MAC has started up, the PHY can enter hibernation and + * disable the RX_CLK clock, this poses no problem for the MAC. + */ + ret = at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN, 0); + if (ret < 0) + return ret; + + ret = at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_HIB_CTRL, + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN, + AT803X_DEBUG_HIB_CTRL_PS_HIB_EN); + if (ret < 0) + return ret; /* The default after hardware reset is hibernation mode enabled. After * software reset, the value is retained.
Toggle hibernation mode OFF and ON to wake the PHY up and make it generate clock on RX_CLK pin for about 10 seconds. These clock are needed during start up by MACs like DWMAC in NXP i.MX8M Plus to release their DMA from reset. After the MAC has started up, the PHY can enter hibernation and disable the RX_CLK clock, this poses no problem for the MAC. Originally, this issue has been described by NXP in commit 9ecf04016c87 ("net: phy: at803x: add disable hibernation mode support") but this approach fully disables the hibernation support and takes away any power saving benefit. This patch instead makes the PHY generate the clock on start up for 10 seconds, which should be long enough for the EQoS MAC to release DMA from reset. Before this patch on i.MX8M Plus board with AR8031 PHY: " $ ifconfig eth1 up [ 25.576734] imx-dwmac 30bf0000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0 [ 25.658916] imx-dwmac 30bf0000.ethernet eth1: PHY [stmmac-1:00] driver [Qualcomm Atheros AR8031/AR8033] (irq=38) [ 26.670276] imx-dwmac 30bf0000.ethernet: Failed to reset the dma [ 26.676322] imx-dwmac 30bf0000.ethernet eth1: stmmac_hw_setup: DMA engine initialization failed [ 26.685103] imx-dwmac 30bf0000.ethernet eth1: __stmmac_open: Hw setup failed ifconfig: SIOCSIFFLAGS: Connection timed out " After this patch on i.MX8M Plus board with AR8031 PHY: " $ ifconfig eth1 up [ 19.419085] imx-dwmac 30bf0000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0 [ 19.507380] imx-dwmac 30bf0000.ethernet eth1: PHY [stmmac-1:00] driver [Qualcomm Atheros AR8031/AR8033] (irq=38) [ 19.528464] imx-dwmac 30bf0000.ethernet eth1: No Safety Features support found [ 19.535769] imx-dwmac 30bf0000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported [ 19.544302] imx-dwmac 30bf0000.ethernet eth1: registered PTP clock [ 19.552008] imx-dwmac 30bf0000.ethernet eth1: FPE workqueue start [ 19.558152] imx-dwmac 30bf0000.ethernet eth1: configuring for phy/rgmii-id link mode " Signed-off-by: Marek Vasut <marex@denx.de> --- Cc: "David S. Miller" <davem@davemloft.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Eric Dumazet <edumazet@google.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Oleksij Rempel <linux@rempel-privat.de> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Wei Fang <wei.fang@nxp.com> Cc: netdev@vger.kernel.org --- drivers/net/phy/at803x.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)