Message ID | E1tnfRe-0057S9-6W@rmk-PC.armlinux.org.uk (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | net: stmmac: fix setting RE and TE inappropriately | expand |
On Thu, Feb 27, 2025 at 03:05:02PM +0000, Russell King (Oracle) wrote: > Phylink does not permit drivers to mess with the netif carrier, as > this will de-synchronise phylink with the MAC driver. Moreover, > setting and clearing the TE and RE bits via stmmac_mac_set() in this > path is also wrong as the link may not be up. > > Replace the netif_carrier_on(), netif_carrier_off() and > stmmac_mac_set() calls with the appropriate phylink_start() and > phylink_stop() calls, thereby allowing phylink to manage the netif > carrier and TE/RE bits through the .mac_link_up() and .mac_link_down() > methods. > > Note that RE should only be set after the DMA is ready to avoid the > receive FIFO between the MAC and DMA blocks overflowing, so > phylink_start() needs to be placed after DMA has been started. Sorry, i don't know enough about XDP to review this :-( Andrew
On Thu, Feb 27, 2025 at 11:27:57PM +0100, Andrew Lunn wrote: > On Thu, Feb 27, 2025 at 03:05:02PM +0000, Russell King (Oracle) wrote: > > Phylink does not permit drivers to mess with the netif carrier, as > > this will de-synchronise phylink with the MAC driver. Moreover, > > setting and clearing the TE and RE bits via stmmac_mac_set() in this > > path is also wrong as the link may not be up. > > > > Replace the netif_carrier_on(), netif_carrier_off() and > > stmmac_mac_set() calls with the appropriate phylink_start() and > > phylink_stop() calls, thereby allowing phylink to manage the netif > > carrier and TE/RE bits through the .mac_link_up() and .mac_link_down() > > methods. > > > > Note that RE should only be set after the DMA is ready to avoid the > > receive FIFO between the MAC and DMA blocks overflowing, so > > phylink_start() needs to be placed after DMA has been started. > > Sorry, i don't know enough about XDP to review this :-( I suspect there aren't many people who could review it. However, delving into the history, it seems that this commit was responsible for introducing stmmac_xdp_{open,release}, thus intrducing the fidding of the netif carrier which is prohibited by phylink: commit ac746c8520d9d056b6963ecca8ff1da9929d02f1 Author: Ong Boon Leong <boon.leong.ong@intel.com> Date: Thu Nov 11 22:39:49 2021 +0800 net: stmmac: enhance XDP ZC driver level switching performance but that commit was wrong for this very reason. Didn't phylib used to not renegotiate the link if nothing changed across a phy_stop()..phy_start() ? I'm wondering whether my commit is in essence reverting this commit.
On Thu, 27 Feb 2025 15:05:02 +0000 "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> wrote: > Phylink does not permit drivers to mess with the netif carrier, as > this will de-synchronise phylink with the MAC driver. Moreover, > setting and clearing the TE and RE bits via stmmac_mac_set() in this > path is also wrong as the link may not be up. > > Replace the netif_carrier_on(), netif_carrier_off() and > stmmac_mac_set() calls with the appropriate phylink_start() and > phylink_stop() calls, thereby allowing phylink to manage the netif > carrier and TE/RE bits through the .mac_link_up() and .mac_link_down() > methods. > > Note that RE should only be set after the DMA is ready to avoid the > receive FIFO between the MAC and DMA blocks overflowing, so > phylink_start() needs to be placed after DMA has been started. > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 14 ++++++-------- > 1 file changed, 6 insertions(+), 8 deletions(-) > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > index 13796b1c8d7c..84d8b1c9f6d4 100644 > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > @@ -6887,6 +6887,8 @@ void stmmac_xdp_release(struct net_device *dev) > /* Ensure tx function is not running */ > netif_tx_disable(dev); > > + phylink_stop(priv->phylink); > + > /* Disable NAPI process */ > stmmac_disable_all_queues(priv); > > @@ -6902,14 +6904,10 @@ void stmmac_xdp_release(struct net_device *dev) > /* Release and free the Rx/Tx resources */ > free_dma_desc_resources(priv, &priv->dma_conf); > > - /* Disable the MAC Rx/Tx */ > - stmmac_mac_set(priv, priv->ioaddr, false); > - > /* set trans_start so we don't get spurious > * watchdogs during reset > */ > netif_trans_update(dev); > - netif_carrier_off(dev); > } > > int stmmac_xdp_open(struct net_device *dev) > @@ -6992,25 +6990,25 @@ int stmmac_xdp_open(struct net_device *dev) > tx_q->txtimer.function = stmmac_tx_timer; > } > > - /* Enable the MAC Rx/Tx */ > - stmmac_mac_set(priv, priv->ioaddr, true); > - > /* Start Rx & Tx DMA Channels */ > stmmac_start_all_dma(priv); > > + phylink_start(priv->phylink); > + > ret = stmmac_request_irq(dev); > if (ret) > goto irq_error; > > /* Enable NAPI process*/ > stmmac_enable_all_queues(priv); > - netif_carrier_on(dev); > netif_tx_start_all_queues(dev); > stmmac_enable_all_dma_irq(priv); > > return 0; > > irq_error: > + phylink_stop(priv->phylink); > + > for (chan = 0; chan < priv->plat->tx_queues_to_use; chan++) > hrtimer_cancel(&priv->dma_conf.tx_queue[chan].txtimer); > XDP programs work like a charm both before and after this patch. Tested-by: Furong Xu <0x1207@gmail.com>
On Fri, Feb 28, 2025 at 03:31:22PM +0800, Furong Xu wrote: > On Thu, 27 Feb 2025 15:05:02 +0000 > "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> wrote: > > > Phylink does not permit drivers to mess with the netif carrier, as > > this will de-synchronise phylink with the MAC driver. Moreover, > > setting and clearing the TE and RE bits via stmmac_mac_set() in this > > path is also wrong as the link may not be up. > > > > Replace the netif_carrier_on(), netif_carrier_off() and > > stmmac_mac_set() calls with the appropriate phylink_start() and > > phylink_stop() calls, thereby allowing phylink to manage the netif > > carrier and TE/RE bits through the .mac_link_up() and .mac_link_down() > > methods. > > > > Note that RE should only be set after the DMA is ready to avoid the > > receive FIFO between the MAC and DMA blocks overflowing, so > > phylink_start() needs to be placed after DMA has been started. > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > XDP programs work like a charm both before and after this patch. > > Tested-by: Furong Xu <0x1207@gmail.com> Thanks for testing this. Could you give a little details of how you actually tested it? The issues here is, when is the link set admin up, requiring that phylink triggers an autoneg etc. For plain old TCP/IP, you at some point use: ip link set eth42 up which will cause the core to call the drivers open() method, which then triggers phylnk. The carrier manipulation which this code replaces seems to suggest you can load an XDP program while the interface is admin down, and that action of loading the program will implicitly set the carrier up. Did you test loading an XDP program on an interface which is admin down? Andrew
On Fri, Feb 28, 2025 at 02:14:29PM +0100, Andrew Lunn wrote: > On Fri, Feb 28, 2025 at 03:31:22PM +0800, Furong Xu wrote: > > On Thu, 27 Feb 2025 15:05:02 +0000 > > "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> wrote: > > > > > Phylink does not permit drivers to mess with the netif carrier, as > > > this will de-synchronise phylink with the MAC driver. Moreover, > > > setting and clearing the TE and RE bits via stmmac_mac_set() in this > > > path is also wrong as the link may not be up. > > > > > > Replace the netif_carrier_on(), netif_carrier_off() and > > > stmmac_mac_set() calls with the appropriate phylink_start() and > > > phylink_stop() calls, thereby allowing phylink to manage the netif > > > carrier and TE/RE bits through the .mac_link_up() and .mac_link_down() > > > methods. > > > > > > Note that RE should only be set after the DMA is ready to avoid the > > > receive FIFO between the MAC and DMA blocks overflowing, so > > > phylink_start() needs to be placed after DMA has been started. > > > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > > XDP programs work like a charm both before and after this patch. > > > > Tested-by: Furong Xu <0x1207@gmail.com> > > Thanks for testing this. > > Could you give a little details of how you actually tested it? > > The issues here is, when is the link set admin up, requiring that > phylink triggers an autoneg etc. > > For plain old TCP/IP, you at some point use: > > ip link set eth42 up > > which will cause the core to call the drivers open() method, which > then triggers phylnk. > > The carrier manipulation which this code replaces seems to suggest you > can load an XDP program while the interface is admin down, and that > action of loading the program will implicitly set the carrier up. It won't. See drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c::stmmac_xdp_set_prog(): if_running = netif_running(dev); need_update = !!priv->xdp_prog != !!prog; if (if_running && need_update) stmmac_xdp_release(dev); ... if (if_running && need_update) stmmac_xdp_open(dev); stmmac_xdp_open() and stmmac_xdp_release() will only be called if netif_running() returns true as explained yesterday (but maybe without enough detail). This tests __LINK_STATE_START in dev->state, which is set just before calling .ndo_open(), and cleared if it fails or before calling .ndo_stop(). Thus, netif_running() indicates whether the net core thinks the device is adminsitratively "up". Therefore, in the old code, if the programme is changed while the device is administratively down, netif_running() will return false, if_running will be false, and neither stmmac_xdp_release() nor stmmac_xdp_open() will be called. Hope that addresses your concern. Thanks.
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 13796b1c8d7c..84d8b1c9f6d4 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -6887,6 +6887,8 @@ void stmmac_xdp_release(struct net_device *dev) /* Ensure tx function is not running */ netif_tx_disable(dev); + phylink_stop(priv->phylink); + /* Disable NAPI process */ stmmac_disable_all_queues(priv); @@ -6902,14 +6904,10 @@ void stmmac_xdp_release(struct net_device *dev) /* Release and free the Rx/Tx resources */ free_dma_desc_resources(priv, &priv->dma_conf); - /* Disable the MAC Rx/Tx */ - stmmac_mac_set(priv, priv->ioaddr, false); - /* set trans_start so we don't get spurious * watchdogs during reset */ netif_trans_update(dev); - netif_carrier_off(dev); } int stmmac_xdp_open(struct net_device *dev) @@ -6992,25 +6990,25 @@ int stmmac_xdp_open(struct net_device *dev) tx_q->txtimer.function = stmmac_tx_timer; } - /* Enable the MAC Rx/Tx */ - stmmac_mac_set(priv, priv->ioaddr, true); - /* Start Rx & Tx DMA Channels */ stmmac_start_all_dma(priv); + phylink_start(priv->phylink); + ret = stmmac_request_irq(dev); if (ret) goto irq_error; /* Enable NAPI process*/ stmmac_enable_all_queues(priv); - netif_carrier_on(dev); netif_tx_start_all_queues(dev); stmmac_enable_all_dma_irq(priv); return 0; irq_error: + phylink_stop(priv->phylink); + for (chan = 0; chan < priv->plat->tx_queues_to_use; chan++) hrtimer_cancel(&priv->dma_conf.tx_queue[chan].txtimer);
Phylink does not permit drivers to mess with the netif carrier, as this will de-synchronise phylink with the MAC driver. Moreover, setting and clearing the TE and RE bits via stmmac_mac_set() in this path is also wrong as the link may not be up. Replace the netif_carrier_on(), netif_carrier_off() and stmmac_mac_set() calls with the appropriate phylink_start() and phylink_stop() calls, thereby allowing phylink to manage the netif carrier and TE/RE bits through the .mac_link_up() and .mac_link_down() methods. Note that RE should only be set after the DMA is ready to avoid the receive FIFO between the MAC and DMA blocks overflowing, so phylink_start() needs to be placed after DMA has been started. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)