Message ID | 20250402133123.840173-1-ap420073@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | e4546c6498c68ba65929fcbcf54ff1947fe53f48 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] eth: bnxt: fix deadlock in the mgmt_ops | expand |
On 04/02, Taehee Yoo wrote: > When queue is being reset, callbacks of mgmt_ops are called by > netdev_nl_bind_rx_doit(). > The netdev_nl_bind_rx_doit() first acquires netdev_lock() and then calls > callbacks. > So, mgmt_ops callbacks should not acquire netdev_lock() internaly. > > The bnxt_queue_{start | stop}() calls napi_{enable | disable}() but they > internally acquire netdev_lock(). > So, deadlock occurs. > > To avoid deadlock, napi_{enable | disable}_locked() should be used > instead. > > Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()") > Signed-off-by: Taehee Yoo <ap420073@gmail.com> Oh, wow, how did I miss that... Thanks for the fix! Acked-by: Stanislav Fomichev <sdf@fomichev.me>
On Wed, Apr 2, 2025 at 6:31 AM Taehee Yoo <ap420073@gmail.com> wrote: > > When queue is being reset, callbacks of mgmt_ops are called by > netdev_nl_bind_rx_doit(). > The netdev_nl_bind_rx_doit() first acquires netdev_lock() and then calls > callbacks. > So, mgmt_ops callbacks should not acquire netdev_lock() internaly. > > The bnxt_queue_{start | stop}() calls napi_{enable | disable}() but they > internally acquire netdev_lock(). > So, deadlock occurs. > > To avoid deadlock, napi_{enable | disable}_locked() should be used > instead. > > Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()") > Signed-off-by: Taehee Yoo <ap420073@gmail.com> Thanks. Reviewed-by: Michael Chan <michael.chan@broadcom.com>
On Wed, 2 Apr 2025 13:31:23 +0000 Taehee Yoo wrote: > When queue is being reset, callbacks of mgmt_ops are called by > netdev_nl_bind_rx_doit(). > The netdev_nl_bind_rx_doit() first acquires netdev_lock() and then calls > callbacks. > So, mgmt_ops callbacks should not acquire netdev_lock() internaly. > > The bnxt_queue_{start | stop}() calls napi_{enable | disable}() but they > internally acquire netdev_lock(). > So, deadlock occurs. > > To avoid deadlock, napi_{enable | disable}_locked() should be used > instead. > > Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()") Too far back, v6.14 doesn't need this fix. I think the Fixes tags should be: Fixes: cae03e5bdd9e ("net: hold netdev instance lock during queue operations") No need to repost, I'll change when applying. Unless I'm wrong.
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Wed, 2 Apr 2025 13:31:23 +0000 you wrote: > When queue is being reset, callbacks of mgmt_ops are called by > netdev_nl_bind_rx_doit(). > The netdev_nl_bind_rx_doit() first acquires netdev_lock() and then calls > callbacks. > So, mgmt_ops callbacks should not acquire netdev_lock() internaly. > > The bnxt_queue_{start | stop}() calls napi_{enable | disable}() but they > internally acquire netdev_lock(). > So, deadlock occurs. > > [...] Here is the summary with links: - [net] eth: bnxt: fix deadlock in the mgmt_ops https://git.kernel.org/netdev/net/c/e4546c6498c6 You are awesome, thank you!
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 1a70605fad38..28ee12186c37 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -15909,7 +15909,7 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx) goto err_reset; } - napi_enable(&bnapi->napi); + napi_enable_locked(&bnapi->napi); bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons); for (i = 0; i < bp->nr_vnics; i++) { @@ -15931,7 +15931,7 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx) err_reset: netdev_err(bp->dev, "Unexpected HWRM error during queue start rc: %d\n", rc); - napi_enable(&bnapi->napi); + napi_enable_locked(&bnapi->napi); bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons); bnxt_reset_task(bp, true); return rc; @@ -15971,7 +15971,7 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx) * completion is handled in NAPI to guarantee no more DMA on that ring * after seeing the completion. */ - napi_disable(&bnapi->napi); + napi_disable_locked(&bnapi->napi); if (bp->tph_mode) { bnxt_hwrm_cp_ring_free(bp, rxr->rx_cpr);
When queue is being reset, callbacks of mgmt_ops are called by netdev_nl_bind_rx_doit(). The netdev_nl_bind_rx_doit() first acquires netdev_lock() and then calls callbacks. So, mgmt_ops callbacks should not acquire netdev_lock() internaly. The bnxt_queue_{start | stop}() calls napi_{enable | disable}() but they internally acquire netdev_lock(). So, deadlock occurs. To avoid deadlock, napi_{enable | disable}_locked() should be used instead. Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()") Signed-off-by: Taehee Yoo <ap420073@gmail.com> --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)