diff mbox series

net: stmmac: don't attach interface until resume finishes

Message ID 20210928083620.29090-1-macpaul.lin@mediatek.com (mailing list archive)
State New, archived
Headers show
Series net: stmmac: don't attach interface until resume finishes | expand

Commit Message

Macpaul Lin Sept. 28, 2021, 8:36 a.m. UTC
From: Leon Yu <leoyu@nvidia.com>

commit 31096c3e8b1163c6e966bf4d1f36d8b699008f84 upstream.

Commit 14b41a2959fb ("net: stmmac: Delete txtimer in suspend()") was the
first attempt to fix a race between mod_timer() and setup_timer()
during stmmac_resume(). However the issue still exists as the commit
only addressed half of the issue.

Same race can still happen as stmmac_resume() re-attaches interface
way too early - even before hardware is fully initialized.  Worse,
doing so allows network traffic to restart and stmmac_tx_timer_arm()
being called in the middle of stmmac_resume(), which re-init tx timers
in stmmac_init_coalesce().  timer_list will be corrupted and system
crashes as a result of race between mod_timer() and setup_timer().

  systemd--1995    2.... 552950018us : stmmac_suspend: 4994
  ksoftirq-9       0..s2 553123133us : stmmac_tx_timer_arm: 2276
  systemd--1995    0.... 553127896us : stmmac_resume: 5101
  systemd--320     7...2 553132752us : stmmac_tx_timer_arm: 2276
  (sd-exec-1999    5...2 553135204us : stmmac_tx_timer_arm: 2276
  ---------------------------------
  pc : run_timer_softirq+0x468/0x5e0
  lr : run_timer_softirq+0x570/0x5e0
  Call trace:
   run_timer_softirq+0x468/0x5e0
   __do_softirq+0x124/0x398
   irq_exit+0xd8/0xe0
   __handle_domain_irq+0x6c/0xc0
   gic_handle_irq+0x60/0xb0
   el1_irq+0xb8/0x180
   arch_cpu_idle+0x38/0x230
   default_idle_call+0x24/0x3c
   do_idle+0x1e0/0x2b8
   cpu_startup_entry+0x28/0x48
   secondary_start_kernel+0x1b4/0x208

Fix this by deferring netif_device_attach() to the end of
stmmac_resume().

Signed-off-by: Leon Yu <leoyu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Greg Kroah-Hartman Oct. 4, 2021, 10:11 a.m. UTC | #1
On Tue, Sep 28, 2021 at 04:36:20PM +0800, Macpaul Lin wrote:
> From: Leon Yu <leoyu@nvidia.com>
> 
> commit 31096c3e8b1163c6e966bf4d1f36d8b699008f84 upstream.
> 
> Commit 14b41a2959fb ("net: stmmac: Delete txtimer in suspend()") was the
> first attempt to fix a race between mod_timer() and setup_timer()
> during stmmac_resume(). However the issue still exists as the commit
> only addressed half of the issue.
> 
> Same race can still happen as stmmac_resume() re-attaches interface
> way too early - even before hardware is fully initialized.  Worse,
> doing so allows network traffic to restart and stmmac_tx_timer_arm()
> being called in the middle of stmmac_resume(), which re-init tx timers
> in stmmac_init_coalesce().  timer_list will be corrupted and system
> crashes as a result of race between mod_timer() and setup_timer().
> 
>   systemd--1995    2.... 552950018us : stmmac_suspend: 4994
>   ksoftirq-9       0..s2 553123133us : stmmac_tx_timer_arm: 2276
>   systemd--1995    0.... 553127896us : stmmac_resume: 5101
>   systemd--320     7...2 553132752us : stmmac_tx_timer_arm: 2276
>   (sd-exec-1999    5...2 553135204us : stmmac_tx_timer_arm: 2276
>   ---------------------------------
>   pc : run_timer_softirq+0x468/0x5e0
>   lr : run_timer_softirq+0x570/0x5e0
>   Call trace:
>    run_timer_softirq+0x468/0x5e0
>    __do_softirq+0x124/0x398
>    irq_exit+0xd8/0xe0
>    __handle_domain_irq+0x6c/0xc0
>    gic_handle_irq+0x60/0xb0
>    el1_irq+0xb8/0x180
>    arch_cpu_idle+0x38/0x230
>    default_idle_call+0x24/0x3c
>    do_idle+0x1e0/0x2b8
>    cpu_startup_entry+0x28/0x48
>    secondary_start_kernel+0x1b4/0x208
> 
> Fix this by deferring netif_device_attach() to the end of
> stmmac_resume().
> 
> Signed-off-by: Leon Yu <leoyu@nvidia.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>

Whenever you forward on a patch, you should add yourself to the
signed-off-by chain.

I'll just add you to the cc: to let us know who asked for this patch.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 10d28be73f45..56d227b31dbd 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4853,8 +4853,6 @@  int stmmac_resume(struct device *dev)
 			stmmac_mdio_reset(priv->mii);
 	}
 
-	netif_device_attach(ndev);
-
 	mutex_lock(&priv->lock);
 
 	stmmac_reset_queues_param(priv);
@@ -4878,6 +4876,8 @@  int stmmac_resume(struct device *dev)
 
 	phylink_mac_change(priv->phylink, true);
 
+	netif_device_attach(ndev);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(stmmac_resume);