diff mbox series

[net-next,1/7] eth: tg3: fix calling napi_enable() in atomic context

Message ID 20250121221519.392014-2-kuba@kernel.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series eth: fix calling napi_enable() in atomic context | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 171 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 2 this patch: 2
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-01-22--06-00 (tests: 885)

Commit Message

Jakub Kicinski Jan. 21, 2025, 10:15 p.m. UTC
tg3 has a spin lock protecting most of the config,
switch to taking netdev_lock() explicitly on enable/start
paths. Disable/stop paths seem to not be under the spin
lock (since napi_disable() already couldn't sleep),
so leave that side as is.

tg3_restart_hw() releases and re-takes the spin lock,
we need to do the same because dev_close() needs to
take netdev_lock().

Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: pavan.chebbi@broadcom.com
CC: mchan@broadcom.com
CC: kuniyu@amazon.com
CC: romieu@fr.zoreil.com
---
 drivers/net/ethernet/broadcom/tg3.c | 33 +++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

Comments

Michael Chan Jan. 21, 2025, 11:06 p.m. UTC | #1
On Tue, Jan 21, 2025 at 2:15 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> tg3 has a spin lock protecting most of the config,
> switch to taking netdev_lock() explicitly on enable/start
> paths. Disable/stop paths seem to not be under the spin
> lock (since napi_disable() already couldn't sleep),

You meant napi_disable() could sleep, right?
Jakub Kicinski Jan. 21, 2025, 11:49 p.m. UTC | #2
On Tue, 21 Jan 2025 15:06:36 -0800 Michael Chan wrote:
> On Tue, Jan 21, 2025 at 2:15 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > tg3 has a spin lock protecting most of the config,
> > switch to taking netdev_lock() explicitly on enable/start
> > paths. Disable/stop paths seem to not be under the spin
> > lock (since napi_disable() already couldn't sleep),  
> 
> You meant napi_disable() could sleep, right?

Yes :)
Michael Chan Jan. 22, 2025, 7:41 a.m. UTC | #3
On Tue, Jan 21, 2025 at 2:15 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> tg3 has a spin lock protecting most of the config,
> switch to taking netdev_lock() explicitly on enable/start
> paths. Disable/stop paths seem to not be under the spin
> lock (since napi_disable() already couldn't sleep),
> so leave that side as is.
>
> tg3_restart_hw() releases and re-takes the spin lock,
> we need to do the same because dev_close() needs to
> take netdev_lock().

The patch looks good to me.  One minor suggestion is to add the Sparse macros:

__releases(tp->dev->lock)
__acquires(tp->dev->lock)

to tg3_restart_hw() since we already do the same thing there for tp->lock.
Eric Dumazet Jan. 22, 2025, 2:10 p.m. UTC | #4
On Tue, Jan 21, 2025 at 11:15 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> tg3 has a spin lock protecting most of the config,
> switch to taking netdev_lock() explicitly on enable/start
> paths. Disable/stop paths seem to not be under the spin
> lock (since napi_disable() already couldn't sleep),
> so leave that side as is.
>
> tg3_restart_hw() releases and re-takes the spin lock,
> we need to do the same because dev_close() needs to
> take netdev_lock().
>
> Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
> Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> Link: https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---

Reviewed-by: Eric Dumazet <edumazet@google.com>
Jakub Kicinski Jan. 22, 2025, 2:27 p.m. UTC | #5
On Tue, 21 Jan 2025 23:41:53 -0800 Michael Chan wrote:
> One minor suggestion is to add the Sparse macros:
> 
> __releases(tp->dev->lock)
> __acquires(tp->dev->lock)
> 
> to tg3_restart_hw() since we already do the same thing there for tp->lock.

Does anyone actually use the sparse lock checking?
IIUC it's disabled by default, it's too noisy.
netdev_lock / netdev_unlock don't have the annotations.
Jakub Kicinski Jan. 22, 2025, 11:48 p.m. UTC | #6
On Wed, 22 Jan 2025 06:27:13 -0800 Jakub Kicinski wrote:
> netdev_lock / netdev_unlock don't have the annotations.

Looks like sparse -Wcontext is happy either way, so I'll add it.
Simon Horman Jan. 23, 2025, 10:48 a.m. UTC | #7
On Wed, Jan 22, 2025 at 03:48:33PM -0800, Jakub Kicinski wrote:
> On Wed, 22 Jan 2025 06:27:13 -0800 Jakub Kicinski wrote:
> > netdev_lock / netdev_unlock don't have the annotations.
> 
> Looks like sparse -Wcontext is happy either way, so I'll add it.

FWIIW, I do notice when these annotations get flagged by Sparse.
Well, at lest some of the time. Though I do confess that I'm not
always able to figure out a satisfactory resolution.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 9cc8db10a8d6..4aebba80bcd2 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -7424,7 +7424,7 @@  static void tg3_napi_enable(struct tg3 *tp)
 
 	for (i = 0; i < tp->irq_cnt; i++) {
 		tnapi = &tp->napi[i];
-		napi_enable(&tnapi->napi);
+		napi_enable_locked(&tnapi->napi);
 		if (tnapi->tx_buffers) {
 			netif_queue_set_napi(tp->dev, txq_idx,
 					     NETDEV_QUEUE_TYPE_TX,
@@ -7445,9 +7445,10 @@  static void tg3_napi_init(struct tg3 *tp)
 	int i;
 
 	for (i = 0; i < tp->irq_cnt; i++) {
-		netif_napi_add(tp->dev, &tp->napi[i].napi,
-			       i ? tg3_poll_msix : tg3_poll);
-		netif_napi_set_irq(&tp->napi[i].napi, tp->napi[i].irq_vec);
+		netif_napi_add_locked(tp->dev, &tp->napi[i].napi,
+				      i ? tg3_poll_msix : tg3_poll);
+		netif_napi_set_irq_locked(&tp->napi[i].napi,
+					  tp->napi[i].irq_vec);
 	}
 }
 
@@ -11271,7 +11272,9 @@  static int tg3_restart_hw(struct tg3 *tp, bool reset_phy)
 		tg3_timer_stop(tp);
 		tp->irq_sync = 0;
 		tg3_napi_enable(tp);
+		netdev_unlock(tp->dev);
 		dev_close(tp->dev);
+		netdev_lock(tp->dev);
 		tg3_full_lock(tp, 0);
 	}
 	return err;
@@ -11299,6 +11302,7 @@  static void tg3_reset_task(struct work_struct *work)
 
 	tg3_netif_stop(tp);
 
+	netdev_lock(tp->dev);
 	tg3_full_lock(tp, 1);
 
 	if (tg3_flag(tp, TX_RECOVERY_PENDING)) {
@@ -11318,12 +11322,14 @@  static void tg3_reset_task(struct work_struct *work)
 		 * call cancel_work_sync() and wait forever.
 		 */
 		tg3_flag_clear(tp, RESET_TASK_PENDING);
+		netdev_unlock(tp->dev);
 		dev_close(tp->dev);
 		goto out;
 	}
 
 	tg3_netif_start(tp);
 	tg3_full_unlock(tp);
+	netdev_unlock(tp->dev);
 	tg3_phy_start(tp);
 	tg3_flag_clear(tp, RESET_TASK_PENDING);
 out:
@@ -11683,9 +11689,11 @@  static int tg3_start(struct tg3 *tp, bool reset_phy, bool test_irq,
 	if (err)
 		goto out_ints_fini;
 
+	netdev_lock(dev);
 	tg3_napi_init(tp);
 
 	tg3_napi_enable(tp);
+	netdev_unlock(dev);
 
 	for (i = 0; i < tp->irq_cnt; i++) {
 		err = tg3_request_irq(tp, i);
@@ -12569,6 +12577,7 @@  static int tg3_set_ringparam(struct net_device *dev,
 		irq_sync = 1;
 	}
 
+	netdev_lock(dev);
 	tg3_full_lock(tp, irq_sync);
 
 	tp->rx_pending = ering->rx_pending;
@@ -12597,6 +12606,7 @@  static int tg3_set_ringparam(struct net_device *dev,
 	}
 
 	tg3_full_unlock(tp);
+	netdev_unlock(dev);
 
 	if (irq_sync && !err)
 		tg3_phy_start(tp);
@@ -12678,6 +12688,7 @@  static int tg3_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam
 			irq_sync = 1;
 		}
 
+		netdev_lock(dev);
 		tg3_full_lock(tp, irq_sync);
 
 		if (epause->autoneg)
@@ -12707,6 +12718,7 @@  static int tg3_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam
 		}
 
 		tg3_full_unlock(tp);
+		netdev_unlock(dev);
 	}
 
 	tp->phy_flags |= TG3_PHYFLG_USER_CONFIGURED;
@@ -13911,6 +13923,7 @@  static void tg3_self_test(struct net_device *dev, struct ethtool_test *etest,
 			data[TG3_INTERRUPT_TEST] = 1;
 		}
 
+		netdev_lock(dev);
 		tg3_full_lock(tp, 0);
 
 		tg3_halt(tp, RESET_KIND_SHUTDOWN, 1);
@@ -13922,6 +13935,7 @@  static void tg3_self_test(struct net_device *dev, struct ethtool_test *etest,
 		}
 
 		tg3_full_unlock(tp);
+		netdev_unlock(dev);
 
 		if (irq_sync && !err2)
 			tg3_phy_start(tp);
@@ -14365,6 +14379,7 @@  static int tg3_change_mtu(struct net_device *dev, int new_mtu)
 
 	tg3_set_mtu(dev, tp, new_mtu);
 
+	netdev_lock(dev);
 	tg3_full_lock(tp, 1);
 
 	tg3_halt(tp, RESET_KIND_SHUTDOWN, 1);
@@ -14384,6 +14399,7 @@  static int tg3_change_mtu(struct net_device *dev, int new_mtu)
 		tg3_netif_start(tp);
 
 	tg3_full_unlock(tp);
+	netdev_unlock(dev);
 
 	if (!err)
 		tg3_phy_start(tp);
@@ -18164,6 +18180,7 @@  static int tg3_resume(struct device *device)
 
 	netif_device_attach(dev);
 
+	netdev_lock(dev);
 	tg3_full_lock(tp, 0);
 
 	tg3_ape_driver_state_change(tp, RESET_KIND_INIT);
@@ -18180,6 +18197,7 @@  static int tg3_resume(struct device *device)
 
 out:
 	tg3_full_unlock(tp);
+	netdev_unlock(dev);
 
 	if (!err)
 		tg3_phy_start(tp);
@@ -18260,7 +18278,9 @@  static pci_ers_result_t tg3_io_error_detected(struct pci_dev *pdev,
 done:
 	if (state == pci_channel_io_perm_failure) {
 		if (netdev) {
+			netdev_lock(netdev);
 			tg3_napi_enable(tp);
+			netdev_unlock(netdev);
 			dev_close(netdev);
 		}
 		err = PCI_ERS_RESULT_DISCONNECT;
@@ -18314,7 +18334,9 @@  static pci_ers_result_t tg3_io_slot_reset(struct pci_dev *pdev)
 
 done:
 	if (rc != PCI_ERS_RESULT_RECOVERED && netdev && netif_running(netdev)) {
+		netdev_lock(netdev);
 		tg3_napi_enable(tp);
+		netdev_unlock(netdev);
 		dev_close(netdev);
 	}
 	rtnl_unlock();
@@ -18340,12 +18362,14 @@  static void tg3_io_resume(struct pci_dev *pdev)
 	if (!netdev || !netif_running(netdev))
 		goto done;
 
+	netdev_lock(netdev);
 	tg3_full_lock(tp, 0);
 	tg3_ape_driver_state_change(tp, RESET_KIND_INIT);
 	tg3_flag_set(tp, INIT_COMPLETE);
 	err = tg3_restart_hw(tp, true);
 	if (err) {
 		tg3_full_unlock(tp);
+		netdev_unlock(netdev);
 		netdev_err(netdev, "Cannot restart hardware after reset.\n");
 		goto done;
 	}
@@ -18357,6 +18381,7 @@  static void tg3_io_resume(struct pci_dev *pdev)
 	tg3_netif_start(tp);
 
 	tg3_full_unlock(tp);
+	netdev_unlock(netdev);
 
 	tg3_phy_start(tp);