Message ID | 20241022172153.217890-1-jdamato@fastly.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC,iwl-net] e1000: Hold RTNL when e1000_down can be called | expand |
On Tue, Oct 22, 2024 at 05:21:53PM +0000, Joe Damato wrote: > e1000_down calls netif_queue_set_napi, which assumes that RTNL is held. > > There are a few paths for e1000_down to be called in e1000 where RTNL is > not currently being held: > - e1000_shutdown (pci shutdown) > - e1000_suspend (power management) > - e1000_reinit_locked (via e1000_reset_task delayed work) > > Hold RTNL in two places to fix this issue: > - e1000_reset_task > - __e1000_shutdown (which is called from both e1000_shutdown and > e1000_suspend). It looks like there's one other spot I missed: e1000_io_error_detected (pci error handler) which should also hold rtnl_lock: + if (netif_running(netdev)) { + rtnl_lock(); e1000_down(adapter); + rtnl_unlock(); + } I can send that update in the v2, but I'll wait to see if Intel has suggestions on the below. > The other paths which call e1000_down seemingly hold RTNL and are OK: > - e1000_close (ndo_stop) > - e1000_change_mtu (ndo_change_mtu) > > I'm submitting this is as an RFC because: > - the e1000_reinit_locked issue appears very similar to commit > 21f857f0321d ("e1000e: add rtnl_lock() to e1000_reset_task"), which > fixes a similar issue in e1000e > > however > > - adding rtnl to e1000_reinit_locked seemingly conflicts with an > earlier e1000 commit b2f963bfaeba ("e1000: fix lockdep warning in > e1000_reset_task"). > > Hopefully Intel can weigh in and shed some light on the correct way to > go. > > Fixes: 8f7ff18a5ec7 ("e1000: Link NAPI instances to queues and IRQs") > Reported-by: Dmitry Antipov <dmantipov@yandex.ru> > Closes: https://lore.kernel.org/netdev/8cf62307-1965-46a0-a411-ff0080090ff9@yandex.ru/ > Signed-off-by: Joe Damato <jdamato@fastly.com>
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c index 4de9b156b2be..9ed99c75d59e 100644 --- a/drivers/net/ethernet/intel/e1000/e1000_main.c +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c @@ -3509,7 +3509,9 @@ static void e1000_reset_task(struct work_struct *work) container_of(work, struct e1000_adapter, reset_task); e_err(drv, "Reset adapter\n"); + rtnl_lock(); e1000_reinit_locked(adapter); + rtnl_unlock(); } /** @@ -5074,7 +5076,9 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake) usleep_range(10000, 20000); WARN_ON(test_bit(__E1000_RESETTING, &adapter->flags)); + rtnl_lock(); e1000_down(adapter); + rtnl_unlock(); } status = er32(STATUS);
e1000_down calls netif_queue_set_napi, which assumes that RTNL is held. There are a few paths for e1000_down to be called in e1000 where RTNL is not currently being held: - e1000_shutdown (pci shutdown) - e1000_suspend (power management) - e1000_reinit_locked (via e1000_reset_task delayed work) Hold RTNL in two places to fix this issue: - e1000_reset_task - __e1000_shutdown (which is called from both e1000_shutdown and e1000_suspend). The other paths which call e1000_down seemingly hold RTNL and are OK: - e1000_close (ndo_stop) - e1000_change_mtu (ndo_change_mtu) I'm submitting this is as an RFC because: - the e1000_reinit_locked issue appears very similar to commit 21f857f0321d ("e1000e: add rtnl_lock() to e1000_reset_task"), which fixes a similar issue in e1000e however - adding rtnl to e1000_reinit_locked seemingly conflicts with an earlier e1000 commit b2f963bfaeba ("e1000: fix lockdep warning in e1000_reset_task"). Hopefully Intel can weigh in and shed some light on the correct way to go. Fixes: 8f7ff18a5ec7 ("e1000: Link NAPI instances to queues and IRQs") Reported-by: Dmitry Antipov <dmantipov@yandex.ru> Closes: https://lore.kernel.org/netdev/8cf62307-1965-46a0-a411-ff0080090ff9@yandex.ru/ Signed-off-by: Joe Damato <jdamato@fastly.com> --- drivers/net/ethernet/intel/e1000/e1000_main.c | 4 ++++ 1 file changed, 4 insertions(+) base-commit: d811ac148f0afd2f3f7e1cd7f54de8da973ec5e3