Message ID | 20230313160645.3332457-1-sassmann@kpanic.de (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | [net,v2] iavf: fix hang on reboot with ice |
On Mon, Mar 13, 2023 at 05:06:45PM +0100, Stefan Assmann wrote:
> When a system with E810 with existing VFs gets rebooted the following
> hang may be observed.
>
> Pid 1 is hung in iavf_remove(), part of a network driver:
> PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow"
>  #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
>  #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
>  #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
>  #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
>  #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
>  #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
>  #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
>  #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
>  #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
>  #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
> #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
> #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
> #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
> #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
> #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
> #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
> #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
> #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
> #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
> #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
> #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
> #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
> #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
> RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7
> RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
> RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90
> R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005
> R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000
> ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b
>
> During reboot all drivers PM shutdown callbacks are invoked.
> In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
> In ice_shutdown() the call chain above is executed, which at some point
> calls iavf_remove(). However iavf_remove() expects the VF to be in one
> of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
> that's not the case it sleeps forever.
> So if iavf_shutdown() gets invoked before iavf_remove() the system will
> hang indefinitely because the adapter is already in state __IAVF_REMOVE.
>
> Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
> as we already went through iavf_shutdown().
>
> Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
> Fixes: a8417330f8a5 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
> Reported-by: Marius Cornea <mcornea@redhat.com>
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> ---
> v2: return instead of breaking the while (1) loop
> This avoids going through remove code twice and is how things worked
> before a8417330f8a5.

Good catch. Indeed there was such a logic before that patch.

Thanks,
Michal

Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>

>
> drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 3273aeb8fa67..ce7071e9af15 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -5066,6 +5066,11 @@ static void iavf_remove(struct pci_dev *pdev)
>  			mutex_unlock(&adapter->crit_lock);
>  			break;
>  		}
> +		/* Simply return if we already went through iavf_shutdown */
> +		if (adapter->state == __IAVF_REMOVE) {
> +			mutex_unlock(&adapter->crit_lock);
> +			return;
> +		}
>
>  		mutex_unlock(&adapter->crit_lock);
>  		usleep_range(500, 1000);
> --
> 2.39.1
>
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 3273aeb8fa67..ce7071e9af15 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -5066,6 +5066,11 @@ static void iavf_remove(struct pci_dev *pdev)
 			mutex_unlock(&adapter->crit_lock);
 			break;
 		}
+		/* Simply return if we already went through iavf_shutdown */
+		if (adapter->state == __IAVF_REMOVE) {
+			mutex_unlock(&adapter->crit_lock);
+			return;
+		}
 
 		mutex_unlock(&adapter->crit_lock);
 		usleep_range(500, 1000);
When a system with E810 with existing VFs gets rebooted the following
hang may be observed.

Pid 1 is hung in iavf_remove(), part of a network driver:
PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow"
 #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
 #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
 #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
 #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
 #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
 #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
 #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
 #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
#10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
#11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
#12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7
RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90
R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005
R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000
ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b

During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().

Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a8417330f8a5 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
---
v2: return instead of breaking the while (1) loop
This avoids going through remove code twice and is how things worked
before a8417330f8a5.

 drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +++++
 1 file changed, 5 insertions(+)
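
For readers who want to see the hang condition in isolation, the wait loop described in the commit message can be modelled in user space. This is only a sketch and not driver code: the `model_*` and `WAIT_*` names are hypothetical, the state is passed in as a fixed value rather than read from `adapter->state` under `crit_lock`, and the loop is bounded by `max_polls` so that the "sleeps forever" case shows up as a return value instead of an actual hang.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver states; not the real iavf enum. */
enum model_state {
	MODEL_INIT,		/* any state the remove path does not expect */
	MODEL_RUNNING,
	MODEL_DOWN,
	MODEL_INIT_FAILED,
	MODEL_REMOVE,		/* what the shutdown callback leaves behind */
};

enum wait_result {
	WAIT_PROCEED,	/* expected state seen: continue with teardown */
	WAIT_RETURN,	/* shutdown already ran: return without teardown */
	WAIT_STUCK,	/* poll budget used up: the real loop would hang */
};

/*
 * Model of the polling loop in the remove path. The real loop is
 * while (1) with a sleep between polls; max_polls bounds it here.
 * with_fix selects whether the check added by the patch is present.
 */
static enum wait_result model_remove_wait(enum model_state state,
					  bool with_fix, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (state == MODEL_RUNNING || state == MODEL_DOWN ||
		    state == MODEL_INIT_FAILED)
			return WAIT_PROCEED;

		/* The check added by the patch. */
		if (with_fix && state == MODEL_REMOVE)
			return WAIT_RETURN;

		/* Driver: unlock, usleep_range(500, 1000), poll again. */
	}
	return WAIT_STUCK;
}
```

Calling `model_remove_wait(MODEL_REMOVE, false, 1000)` exhausts the poll budget, which corresponds to the reported hang, while the same call with `with_fix` set to `true` returns on the first poll. The expected states are unaffected by the added check.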