diff mbox series

[net,v2] net: atlantic: fix warning during hot unplug

Message ID 20250203143604.24930-3-mail@jakemoroni.com (mailing list archive)
State Accepted
Commit 028676bb189ed6d1b550a0fc570a9d695b6acfd3
Delegated to: Netdev Maintainers
Headers show
Series [net,v2] net: atlantic: fix warning during hot unplug | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 10 of 10 maintainers
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 10 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-02-04--00-00 (tests: 886)

Commit Message

Jacob Moroni Feb. 3, 2025, 2:36 p.m. UTC
Firmware deinitialization performs MMIO accesses which are not
necessary if the device has already been removed. In some cases,
these accesses happen via readx_poll_timeout_atomic which ends up
timing out, resulting in a warning at hw_atl2_utils_fw.c:112:

[  104.595913] Call Trace:
[  104.595915]  <TASK>
[  104.595918]  ? show_regs+0x6c/0x80
[  104.595923]  ? __warn+0x8d/0x150
[  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595934]  ? report_bug+0x182/0x1b0
[  104.595938]  ? handle_bug+0x6e/0xb0
[  104.595940]  ? exc_invalid_op+0x18/0x80
[  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
[  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
[  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
[  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
[  104.595975]  __dev_close_many+0xad/0x160
[  104.595978]  dev_close_many+0x99/0x170
[  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
[  104.595981]  ? __call_rcu_common+0xcd/0x700
[  104.595984]  unregister_netdevice_queue+0xc6/0x110
[  104.595986]  unregister_netdev+0x1c/0x30
[  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]

Fix this by skipping firmware deinitialization altogether if the
PCI device is no longer present.

Tested with an AQC113 attached via Thunderbolt by performing
repeated unplug cycles while traffic was running via iperf.

Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
Signed-off-by: Jacob Moroni <mail@jakemoroni.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
---
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Simon Horman Feb. 4, 2025, 10:54 a.m. UTC | #1
On Mon, Feb 03, 2025 at 09:36:05AM -0500, Jacob Moroni wrote:
> Firmware deinitialization performs MMIO accesses which are not
> necessary if the device has already been removed. In some cases,
> these accesses happen via readx_poll_timeout_atomic which ends up
> timing out, resulting in a warning at hw_atl2_utils_fw.c:112:
> 
> [  104.595913] Call Trace:
> [  104.595915]  <TASK>
> [  104.595918]  ? show_regs+0x6c/0x80
> [  104.595923]  ? __warn+0x8d/0x150
> [  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595934]  ? report_bug+0x182/0x1b0
> [  104.595938]  ? handle_bug+0x6e/0xb0
> [  104.595940]  ? exc_invalid_op+0x18/0x80
> [  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
> [  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
> [  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
> [  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
> [  104.595975]  __dev_close_many+0xad/0x160
> [  104.595978]  dev_close_many+0x99/0x170
> [  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
> [  104.595981]  ? __call_rcu_common+0xcd/0x700
> [  104.595984]  unregister_netdevice_queue+0xc6/0x110
> [  104.595986]  unregister_netdev+0x1c/0x30
> [  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]
> 
> Fix this by skipping firmware deinitialization altogether if the
> PCI device is no longer present.
> 
> Tested with an AQC113 attached via Thunderbolt by performing
> repeated unplug cycles while traffic was running via iperf.
> 
> Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
> Signed-off-by: Jacob Moroni <mail@jakemoroni.com>
> Reviewed-by: Igor Russkikh <irusskikh@marvell.com>

Thanks for addressing my review of v1.

Reviewed-by: Simon Horman <horms@kernel.org>
patchwork-bot+netdevbpf@kernel.org Feb. 4, 2025, 10:20 p.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon,  3 Feb 2025 09:36:05 -0500 you wrote:
> Firmware deinitialization performs MMIO accesses which are not
> necessary if the device has already been removed. In some cases,
> these accesses happen via readx_poll_timeout_atomic which ends up
> timing out, resulting in a warning at hw_atl2_utils_fw.c:112:
> 
> [  104.595913] Call Trace:
> [  104.595915]  <TASK>
> [  104.595918]  ? show_regs+0x6c/0x80
> [  104.595923]  ? __warn+0x8d/0x150
> [  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595934]  ? report_bug+0x182/0x1b0
> [  104.595938]  ? handle_bug+0x6e/0xb0
> [  104.595940]  ? exc_invalid_op+0x18/0x80
> [  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
> [  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
> [  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
> [  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
> [  104.595975]  __dev_close_many+0xad/0x160
> [  104.595978]  dev_close_many+0x99/0x170
> [  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
> [  104.595981]  ? __call_rcu_common+0xcd/0x700
> [  104.595984]  unregister_netdevice_queue+0xc6/0x110
> [  104.595986]  unregister_netdev+0x1c/0x30
> [  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]
> 
> [...]

Here is the summary with links:
  - [net,v2] net: atlantic: fix warning during hot unplug
    https://git.kernel.org/netdev/net/c/028676bb189e

You are awesome, thank you!
diff mbox series

Patch

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index fe0e3e2a8117..71e50fc65c14 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -1441,7 +1441,9 @@  void aq_nic_deinit(struct aq_nic_s *self, bool link_down)
 	aq_ptp_ring_free(self);
 	aq_ptp_free(self);
 
-	if (likely(self->aq_fw_ops->deinit) && link_down) {
+	/* May be invoked during hot unplug. */
+	if (pci_device_is_present(self->pdev) &&
+	    likely(self->aq_fw_ops->deinit) && link_down) {
 		mutex_lock(&self->fwreq_mutex);
 		self->aq_fw_ops->deinit(self->aq_hw);
 		mutex_unlock(&self->fwreq_mutex);