Message ID | 20250403135311.545633-8-shaojijie@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | There are some bugfix for hibmcge driver |
On Thu, Apr 03, 2025 at 09:53:11PM +0800, Jijie Shao wrote:
> After detecting the np_link_fail exception,
> the driver attempts to fix the exception by
> using phy_stop() and phy_start() in the scheduled task.
>
> However, hbg_fix_np_link_fail() and .ndo_stop()
> may be concurrently executed. As a result,
> phy_stop() is executed twice, and the following Calltrace occurs:
>
> hibmcge 0000:84:00.2 enp132s0f2: Link is Down
> hibmcge 0000:84:00.2: failed to link between MAC and PHY, try to fix...
> ------------[ cut here ]------------
> called from state HALTED
> WARNING: CPU: 71 PID: 23391 at drivers/net/phy/phy.c:1503 phy_stop...
> ...
> pc : phy_stop+0x138/0x180
> lr : phy_stop+0x138/0x180
> sp : ffff8000c76bbd40
> x29: ffff8000c76bbd40 x28: 0000000000000000 x27: 0000000000000000
> x26: ffff2020047358c0 x25: ffff202004735940 x24: ffff20200000e405
> x23: ffff2020060e5178 x22: ffff2020060e4000 x21: ffff2020060e49c0
> x20: ffff2020060e5170 x19: ffff20202538e000 x18: 0000000000000020
> x17: 0000000000000000 x16: ffffcede02e28f40 x15: ffffffffffffffff
> x14: 0000000000000000 x13: 205d313933333254 x12: 5b5d393430303233
> x11: ffffcede04555958 x10: ffffcede04495918 x9 : ffffcede0274fee0
> x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
> x5 : 00000000002bffa8 x4 : 0000000000000000 x3 : 0000000000000000
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff20202e429480
> Call trace:
> phy_stop+0x138/0x180
> hbg_fix_np_link_fail+0x4c/0x90 [hibmcge]
> hbg_service_task+0xfc/0x148 [hibmcge]
> process_one_work+0x180/0x398
> worker_thread+0x210/0x328
> kthread+0xe0/0xf0
> ret_from_fork+0x10/0x20
> ---[ end trace 0000000000000000 ]---
>
> This patch adds the rtnl_lock to hbg_fix_np_link_fail()
> to ensure that other operations are not performed concurrently.
> In addition, np_link_fail exception can be fixed
> only when the PHY is link.
>
> Fixes: e0306637e85d ("net: hibmcge: Add support for mac link exception handling feature")
> Signed-off-by: Jijie Shao <shaojijie@huawei.com>

Reviewed-by: Simon Horman <horms@kernel.org>
diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
index f29a937ad087..42b0083c9193 100644
--- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
@@ -2,6 +2,7 @@
 // Copyright (c) 2024 Hisilicon Limited.
 
 #include <linux/phy.h>
+#include <linux/rtnetlink.h>
 #include "hbg_common.h"
 #include "hbg_hw.h"
 #include "hbg_mdio.h"
@@ -133,12 +134,17 @@ void hbg_fix_np_link_fail(struct hbg_priv *priv)
 {
 	struct device *dev = &priv->pdev->dev;
 
+	rtnl_lock();
+
 	if (priv->stats.np_link_fail_cnt >= HBG_NP_LINK_FAIL_RETRY_TIMES) {
 		dev_err(dev, "failed to fix the MAC link status\n");
 		priv->stats.np_link_fail_cnt = 0;
-		return;
+		goto unlock;
 	}
 
+	if (!priv->mac.phydev->link)
+		goto unlock;
+
 	priv->stats.np_link_fail_cnt++;
 	dev_err(dev, "failed to link between MAC and PHY, try to fix...\n");
 
@@ -147,6 +153,9 @@ void hbg_fix_np_link_fail(struct hbg_priv *priv)
 	 */
 	hbg_phy_stop(priv);
 	hbg_phy_start(priv);
+
+unlock:
+	rtnl_unlock();
 }
 
 static void hbg_phy_adjust_link(struct net_device *netdev)
After detecting the np_link_fail exception,
the driver attempts to fix the exception by
using phy_stop() and phy_start() in the scheduled task.

However, hbg_fix_np_link_fail() and .ndo_stop()
may be executed concurrently. As a result,
phy_stop() is executed twice, and the following call trace occurs:

 hibmcge 0000:84:00.2 enp132s0f2: Link is Down
 hibmcge 0000:84:00.2: failed to link between MAC and PHY, try to fix...
 ------------[ cut here ]------------
 called from state HALTED
 WARNING: CPU: 71 PID: 23391 at drivers/net/phy/phy.c:1503 phy_stop...
 ...
 pc : phy_stop+0x138/0x180
 lr : phy_stop+0x138/0x180
 sp : ffff8000c76bbd40
 x29: ffff8000c76bbd40 x28: 0000000000000000 x27: 0000000000000000
 x26: ffff2020047358c0 x25: ffff202004735940 x24: ffff20200000e405
 x23: ffff2020060e5178 x22: ffff2020060e4000 x21: ffff2020060e49c0
 x20: ffff2020060e5170 x19: ffff20202538e000 x18: 0000000000000020
 x17: 0000000000000000 x16: ffffcede02e28f40 x15: ffffffffffffffff
 x14: 0000000000000000 x13: 205d313933333254 x12: 5b5d393430303233
 x11: ffffcede04555958 x10: ffffcede04495918 x9 : ffffcede0274fee0
 x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
 x5 : 00000000002bffa8 x4 : 0000000000000000 x3 : 0000000000000000
 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff20202e429480
 Call trace:
  phy_stop+0x138/0x180
  hbg_fix_np_link_fail+0x4c/0x90 [hibmcge]
  hbg_service_task+0xfc/0x148 [hibmcge]
  process_one_work+0x180/0x398
  worker_thread+0x210/0x328
  kthread+0xe0/0xf0
  ret_from_fork+0x10/0x20
 ---[ end trace 0000000000000000 ]---

This patch takes rtnl_lock in hbg_fix_np_link_fail()
to ensure that other operations are not performed concurrently.
In addition, the np_link_fail exception can be fixed
only when the PHY link is up.

Fixes: e0306637e85d ("net: hibmcge: Add support for mac link exception handling feature")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
 drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
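
For readers who want to try the locking pattern outside the kernel, here is a minimal user-space sketch of the idea described in the commit message. It is an analogy only, not driver code: every `toy_*` name is hypothetical, a pthread mutex stands in for rtnl_lock, and a counter stands in for the "called from state HALTED" warning. Unlike the kernel, where the core already holds rtnl_lock when it calls .ndo_stop(), the toy stop path takes the mutex itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* All toy_* names are hypothetical stand-ins, not hibmcge or phylib APIs. */
enum toy_phy_state { TOY_PHY_HALTED, TOY_PHY_RUNNING };

struct toy_dev {
	pthread_mutex_t cfg_lock;	/* plays the role of rtnl_lock */
	enum toy_phy_state state;
	bool link;			/* plays the role of phydev->link */
	int stop_warnings;		/* counts stop calls made while HALTED */
};

static void toy_dev_init(struct toy_dev *d)
{
	pthread_mutex_init(&d->cfg_lock, NULL);
	d->state = TOY_PHY_RUNNING;
	d->link = true;
	d->stop_warnings = 0;
}

/* Stopping an already halted PHY is the condition that triggers the warning. */
static void toy_phy_stop(struct toy_dev *d)
{
	if (d->state == TOY_PHY_HALTED) {
		d->stop_warnings++;
		fprintf(stderr, "called from state HALTED\n");
		return;
	}
	d->state = TOY_PHY_HALTED;
	d->link = false;
}

static void toy_phy_start(struct toy_dev *d)
{
	d->state = TOY_PHY_RUNNING;
}

/* Models the interface-down path (assumption: it takes the lock itself). */
static void toy_ndo_stop(struct toy_dev *d)
{
	pthread_mutex_lock(&d->cfg_lock);
	toy_phy_stop(d);
	pthread_mutex_unlock(&d->cfg_lock);
}

/* Models the patched recovery path: lock, re-check the link, then restart. */
static void toy_fix_link(struct toy_dev *d)
{
	pthread_mutex_lock(&d->cfg_lock);
	if (!d->link)
		goto unlock;	/* link is down, e.g. already stopped: do nothing */
	toy_phy_stop(d);
	toy_phy_start(d);
unlock:
	pthread_mutex_unlock(&d->cfg_lock);
}

/* Optionally stop the device first, then run recovery; returns warning count. */
int toy_demo(bool stop_first)
{
	struct toy_dev d;

	toy_dev_init(&d);
	if (stop_first)
		toy_ndo_stop(&d);
	toy_fix_link(&d);
	return d.stop_warnings;
}
```

Calling `toy_demo(true)` models the recovery task running after the interface has been brought down: the link check under the lock makes the recovery path return without a second stop. Without the check, `toy_fix_link()` would call `toy_phy_stop()` on a halted PHY and increment the warning counter.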