Message ID | 20231127090426.3761729-7-claudiu.beznea.uj@bp.renesas.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: ravb: Fixes for the ravb driver | expand |
On 11/27/23 12:04 PM, Claudiu wrote: > From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > > On RZ/G3S SMARC Carrier II board having RGMII connections b/w Ethernet > MACs and PHYs it has been discovered that doing unbind/bind for ravb > driver in a loop leads to wrong speed and duplex for Ethernet links and > broken connectivity (the connectivity cannot be restored even with > bringing interface down/up). Before doing unbind/bind the Ethernet > interfaces were configured though systemd. The sh instructions used to > do unbind/bind were: > > $ cd /sys/bus/platform/drivers/ravb/ > $ while :; do echo 11c30000.ethernet > unbind ; \ > echo 11c30000.ethernet > bind; done > > It has been discovered that there is a race b/w IOCTLs initialized by > systemd at the response of success binding and the > "ravb_write(ndev, CCC_OPC_RESET, CCC)" instruction in ravb_remove() as s/instruction/call/, perhaps? > follows: > > 1/ as a result of bind success the user space open/configures the > interfaces tough an IOCTL; the following stack trace has been > identified on RZ/G3S: > > Call trace: > dump_backtrace+0x9c/0x100 > show_stack+0x20/0x38 > dump_stack_lvl+0x48/0x60 > dump_stack+0x18/0x28 > ravb_open+0x70/0xa58 > __dev_open+0xf4/0x1e8 > __dev_change_flags+0x198/0x218 > dev_change_flags+0x2c/0x80 > devinet_ioctl+0x640/0x708 > inet_ioctl+0x1e4/0x200 > sock_do_ioctl+0x50/0x108 > sock_ioctl+0x240/0x358 > __arm64_sys_ioctl+0xb0/0x100 > invoke_syscall+0x50/0x128 > el0_svc_common.constprop.0+0xc8/0xf0 > do_el0_svc+0x24/0x38 > el0_svc+0x34/0xb8 > el0t_64_sync_handler+0xc0/0xc8 > el0t_64_sync+0x190/0x198 > > 2/ this call may execute concurrently with ravb_remove() as the > unbind/bind operation was executed in a loop > 3/ if the operation mode is changed to RESET (though Through? > ravb_write(ndev, CCC_OPC_RESET, CCC) instruction in ravb_remove()) s/instruction/call/, perhaps? > while the above ravb_open() is in progress it may lead to MAC > (or PHY, or MAC-PHY connection, the right point hasn't been identified > at the moment) to be broken, thus the Ethernet connectivity fails to > restore. > > The simple fix for this is to move ravb_write(ndev, CCC_OPC_RESET, CCC)) > after unregister_netdev() to avoid resetting the controller while the > netdev interface is still registered. > > To avoid future issues in ravb_remove(), the patch follows the proper order > of operations in ravb_remove(): reverse order compared with ravb_probe(). > This avoids described races as the IOCTLs as well as unregister_netdev() > (called now at the beginning of ravb_remove()) calls rtnl_lock() before > continuing and IOCTLs check (though devinet_ioctl()) if device is still > registered just after taking the lock: > > int devinet_ioctl(struct net *net, unsigned int cmd, struct ifreq *ifr) > { > // ... > > rtnl_lock(); > > ret = -ENODEV; > dev = __dev_get_by_name(net, ifr->ifr_name); > if (!dev) > goto done; > > // ... > done: > rtnl_unlock(); > out: > return ret; > } > > Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") > Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> [...] Sorry for overlooking this race (and other bugs) when prepping the driver for upstream! MBR, Sergey
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 805720166ef3..9cad10db59b7 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -2894,22 +2894,26 @@ static void ravb_remove(struct platform_device *pdev) struct ravb_private *priv = netdev_priv(ndev); const struct ravb_hw_info *info = priv->info; - /* Stop PTP Clock driver */ - if (info->ccc_gac) - ravb_ptp_stop(ndev); - - clk_disable_unprepare(priv->gptp_clk); - clk_disable_unprepare(priv->refclk); - - /* Set reset mode */ - ravb_write(ndev, CCC_OPC_RESET, CCC); unregister_netdev(ndev); if (info->nc_queues) netif_napi_del(&priv->napi[RAVB_NC]); netif_napi_del(&priv->napi[RAVB_BE]); + ravb_mdio_release(priv); + + /* Stop PTP Clock driver */ + if (info->ccc_gac) + ravb_ptp_stop(ndev); + dma_free_coherent(ndev->dev.parent, priv->desc_bat_size, priv->desc_bat, priv->desc_bat_dma); + + /* Set reset mode */ + ravb_write(ndev, CCC_OPC_RESET, CCC); + + clk_disable_unprepare(priv->gptp_clk); + clk_disable_unprepare(priv->refclk); + pm_runtime_put_sync(&pdev->dev); pm_runtime_disable(&pdev->dev); reset_control_assert(priv->rstc);