
net/mlx5: Fix variable not being completed when function returns

Message ID: 20250106091426.256243-1-zhaochenguang@kylinos.cn
State: Changes Requested
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/series_format            warning  Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1 this patch: 1
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 9 of 9 maintainers
netdev/build_clang              success  Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1 this patch: 1
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 7 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2025-01-06--15-00 (tests: 887)

Commit Message

Chenguang Zhao Jan. 6, 2025, 9:14 a.m. UTC
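cmd_work_handler() returns early when cmd_alloc_index() fails to
allocate a command entry, but it does not complete ent->slotted
before returning. The task waiting on that completion then blocks in
wait_for_completion(), as the hung task trace below shows.

Complete ent->slotted on this error path before returning.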

	Trace:

     mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
     INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
           Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
     kworker/13:2    D    0 4055883      2 0x00000228
     Workqueue: events mlx5e_tx_dim_work [mlx5_core]
     Call trace:
      __switch_to+0xe8/0x150
      __schedule+0x2a8/0x9b8
      schedule+0x2c/0x88
      schedule_timeout+0x204/0x478
      wait_for_common+0x154/0x250
      wait_for_completion+0x28/0x38
      cmd_exec+0x7a0/0xa00 [mlx5_core]
      mlx5_cmd_exec+0x54/0x80 [mlx5_core]
      mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
      mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
      mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
      process_one_work+0x1b0/0x448
      worker_thread+0x54/0x468
      kthread+0x134/0x138
      ret_from_fork+0x10/0x18

Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Moshe Shemesh Jan. 7, 2025, 10:02 a.m. UTC | #1
On 1/6/2025 11:14 AM, Chenguang Zhao wrote:
>      The cmd_work_handler function returns from the child function
>      cmd_alloc_index because the allocate command entry fails,
>      Before returning, there is no complete ent->slotted.

nit: s/Before/before/

> 
>      The patch fixes it.
> 
> 	Trace:
> 
>       mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to
> 	  allocate command entry
>       INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
>             Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
>       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> 	  this message.
>       kworker/13:2    D    0 4055883      2 0x00000228
>       Workqueue: events mlx5e_tx_dim_work [mlx5_core]
>       Call trace:
>        __switch_to+0xe8/0x150
>        __schedule+0x2a8/0x9b8
>        schedule+0x2c/0x88
>        schedule_timeout+0x204/0x478
>        wait_for_common+0x154/0x250
>        wait_for_completion+0x28/0x38
>        cmd_exec+0x7a0/0xa00 [mlx5_core]
>        mlx5_cmd_exec+0x54/0x80 [mlx5_core]
>        mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
>        mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
>        mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
>        process_one_work+0x1b0/0x448
>        worker_thread+0x54/0x468
>        kthread+0x134/0x138
>        ret_from_fork+0x10/0x18
> 
> Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>

Thanks for your fix!
Please add the following fixes tag:
Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>

> ---
>   drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> index 6bd8a18e3af3..e733b81e18a2 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> @@ -1013,6 +1013,7 @@ static void cmd_work_handler(struct work_struct *work)
>   				complete(&ent->done);
>   			}
>   			up(&cmd->vars.sem);
> +			complete(&ent->slotted);
>   			return;
>   		}
>   	} else {

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 6bd8a18e3af3..e733b81e18a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1013,6 +1013,7 @@  static void cmd_work_handler(struct work_struct *work)
 				complete(&ent->done);
 			}
 			up(&cmd->vars.sem);
+			complete(&ent->slotted);
 			return;
 		}
 	} else {
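
For readers unfamiliar with the pattern, the effect of the one-line change can be modeled outside the kernel. The following is a minimal single-threaded userspace sketch, not the mlx5 driver code: struct completion, init_completion(), complete() and completion_done() are local stand-ins named after the kernel API, while struct cmd_ent, handle_alloc_failure() and waiter_can_proceed() are hypothetical helpers introduced only for illustration. It assumes, based on the commit message and the trace, that the submitting task waits on ent->slotted as well as ent->done.

```c
/* Single-threaded userspace model of the error path; not driver code. */
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the kernel's struct completion. */
struct completion {
	unsigned int done;
};

static void init_completion(struct completion *c)
{
	c->done = 0;
}

static void complete(struct completion *c)
{
	c->done++;
}

/* True when a wait on this completion would return without blocking. */
static bool completion_done(const struct completion *c)
{
	return c->done != 0;
}

/* Hypothetical reduced command entry: only the two completions matter here. */
struct cmd_ent {
	struct completion slotted;
	struct completion done;
};

/*
 * Models only the branch taken when allocating the command entry fails.
 * signal_slotted == false mirrors the code before the patch,
 * signal_slotted == true mirrors the code after it.
 * The semaphore release (up()) is left out of this model.
 */
static void handle_alloc_failure(struct cmd_ent *ent, bool signal_slotted)
{
	complete(&ent->done);
	if (signal_slotted)
		complete(&ent->slotted);
}

/*
 * Assumption: the waiter needs both completions to be signaled.
 * Returns true if it could proceed after the failure path has run.
 */
static bool waiter_can_proceed(bool signal_slotted)
{
	struct cmd_ent ent;

	init_completion(&ent.slotted);
	init_completion(&ent.done);
	handle_alloc_failure(&ent, signal_slotted);

	/* done is signaled on this path both before and after the patch. */
	assert(completion_done(&ent.done));
	return completion_done(&ent.slotted);
}
```

In this model waiter_can_proceed(false) returns false, which corresponds to the blocked kworker in the trace, and waiter_can_proceed(true) returns true. A real waiter would sleep in wait_for_completion() instead of polling a flag; the flag check is used here only to keep the sketch free of threads.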