
[v2] net/mlx5: Fix variable not being completed when function returns

Message ID: 20250108030009.68520-1-zhaochenguang@kylinos.cn
State: Superseded
Series: [v2] net/mlx5: Fix variable not being completed when function returns

Commit Message

Chenguang Zhao Jan. 8, 2025, 3 a.m. UTC
When cmd_alloc_index() fails to allocate a command entry,
    cmd_work_handler() returns early without completing ent->slotted,
    so a task waiting on that completion can hang, as in the trace below.

    Complete ent->slotted before returning on this error path.

     mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
     INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
           Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
     kworker/13:2    D    0 4055883      2 0x00000228
     Workqueue: events mlx5e_tx_dim_work [mlx5_core]
     Call trace:
      __switch_to+0xe8/0x150
      __schedule+0x2a8/0x9b8
      schedule+0x2c/0x88
      schedule_timeout+0x204/0x478
      wait_for_common+0x154/0x250
      wait_for_completion+0x28/0x38
      cmd_exec+0x7a0/0xa00 [mlx5_core]
      mlx5_cmd_exec+0x54/0x80 [mlx5_core]
      mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
      mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
      mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
      process_one_work+0x1b0/0x448
      worker_thread+0x54/0x468
      kthread+0x134/0x138
      ret_from_fork+0x10/0x18

    Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")

Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
---
v2:
	add Fixes tag and Reviewed-by
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Tariq Toukan Jan. 9, 2025, 1:25 p.m. UTC | #1
On 08/01/2025 5:00, Chenguang Zhao wrote:
>      The cmd_work_handler function returns from the child function
>      cmd_alloc_index because the allocate command entry fails,
>      Before returning, there is no complete ent->slotted.
> 
>      The patch fixes it.
> 

Unnecessary indentation.

>       mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
>       INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
>             Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
>       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>       kworker/13:2    D    0 4055883      2 0x00000228
>       Workqueue: events mlx5e_tx_dim_work [mlx5_core]
>       Call trace:
>        __switch_to+0xe8/0x150
>        __schedule+0x2a8/0x9b8
>        schedule+0x2c/0x88
>        schedule_timeout+0x204/0x478
>        wait_for_common+0x154/0x250
>        wait_for_completion+0x28/0x38
>        cmd_exec+0x7a0/0xa00 [mlx5_core]
>        mlx5_cmd_exec+0x54/0x80 [mlx5_core]
>        mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
>        mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
>        mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
>        process_one_work+0x1b0/0x448
>        worker_thread+0x54/0x468
>        kthread+0x134/0x138
>        ret_from_fork+0x10/0x18
> 
>      Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")

Also for the Fixes tag.

Other than that:
Acked-by: Tariq Toukan <tariqt@nvidia.com>


> 
> Signed-off-by: Chenguang Zhao zhaochenguang@kylinos.cn
> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
> ---
> v2:
> 	add Fixes tag and Reviewed-by
> ---
>   drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> index 6bd8a18e3af3..e733b81e18a2 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> @@ -1013,6 +1013,7 @@ static void cmd_work_handler(struct work_struct *work)
>   				complete(&ent->done);
>   			}
>   			up(&cmd->vars.sem);
> +			complete(&ent->slotted);
>   			return;
>   		}
>   	} else {

Jakub Kicinski Jan. 9, 2025, 4:29 p.m. UTC | #2
On Thu, 9 Jan 2025 15:25:36 +0200 Tariq Toukan wrote:
> >       mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
> >       INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
> >             Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
> >       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >       kworker/13:2    D    0 4055883      2 0x00000228
> >       Workqueue: events mlx5e_tx_dim_work [mlx5_core]
> >       Call trace:
> >        __switch_to+0xe8/0x150
> >        __schedule+0x2a8/0x9b8
> >        schedule+0x2c/0x88
> >        schedule_timeout+0x204/0x478
> >        wait_for_common+0x154/0x250
> >        wait_for_completion+0x28/0x38
> >        cmd_exec+0x7a0/0xa00 [mlx5_core]
> >        mlx5_cmd_exec+0x54/0x80 [mlx5_core]
> >        mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
> >        mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
> >        mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
> >        process_one_work+0x1b0/0x448
> >        worker_thread+0x54/0x468
> >        kthread+0x134/0x138
> >        ret_from_fork+0x10/0x18
> > 
> >      Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")  
> 
> Also for the Fixes tag.
> 
> Other than that:
> Acked-by: Tariq Toukan <tariqt@nvidia.com>

rewritten the commit message and applied, thanks!

patchwork-bot+netdevbpf@kernel.org Jan. 9, 2025, 4:30 p.m. UTC | #3
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed,  8 Jan 2025 11:00:09 +0800 you wrote:
> The cmd_work_handler function returns from the child function
>     cmd_alloc_index because the allocate command entry fails,
>     Before returning, there is no complete ent->slotted.
> 
>     The patch fixes it.
> 
>      mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
>      INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
>            Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
>      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>      kworker/13:2    D    0 4055883      2 0x00000228
>      Workqueue: events mlx5e_tx_dim_work [mlx5_core]
>      Call trace:
>       __switch_to+0xe8/0x150
>       __schedule+0x2a8/0x9b8
>       schedule+0x2c/0x88
>       schedule_timeout+0x204/0x478
>       wait_for_common+0x154/0x250
>       wait_for_completion+0x28/0x38
>       cmd_exec+0x7a0/0xa00 [mlx5_core]
>       mlx5_cmd_exec+0x54/0x80 [mlx5_core]
>       mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
>       mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
>       mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
>       process_one_work+0x1b0/0x448
>       worker_thread+0x54/0x468
>       kthread+0x134/0x138
>       ret_from_fork+0x10/0x18
> 
> [...]

Here is the summary with links:
  - [v2] net/mlx5: Fix variable not being completed when function returns
    https://git.kernel.org/netdev/net/c/0e2909c6bec9

You are awesome, thank you!

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 6bd8a18e3af3..e733b81e18a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1013,6 +1013,7 @@  static void cmd_work_handler(struct work_struct *work)
 				complete(&ent->done);
 			}
 			up(&cmd->vars.sem);
+			complete(&ent->slotted);
 			return;
 		}
 	} else {
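
The one-line change above is easier to reason about with a small userspace model. The sketch below is not driver code: it reduces a completion to a boolean flag, runs the handler synchronously, and reports whether a waiter on `slotted` would have been left blocked. The `struct cmd_ent` type, the `alloc_fails` and `with_fix` parameters, and `waiter_would_hang()` are hypothetical names introduced for illustration; only `cmd_work_handler`, `cmd_alloc_index`, and `ent->slotted` come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct completion: just a flag. */
struct completion {
	bool done;
};

static void init_completion(struct completion *c)
{
	c->done = false;
}

static void complete(struct completion *c)
{
	c->done = true;
}

/* Hypothetical, minimal command entry holding only the relevant completion. */
struct cmd_ent {
	struct completion slotted;
};

/*
 * Model of the handler's control flow. alloc_fails simulates a failing
 * cmd_alloc_index(); with_fix selects whether the early-return path
 * signals ent->slotted, as the patch does.
 */
static void work_handler(struct cmd_ent *ent, bool alloc_fails, bool with_fix)
{
	assert(ent != NULL);

	if (alloc_fails) {
		if (with_fix)
			complete(&ent->slotted);	/* the added line */
		return;					/* early return */
	}

	/* Normal path: a slot was obtained, so the completion is signaled. */
	complete(&ent->slotted);
}

/*
 * Returns true if a task waiting on ent->slotted would never be woken,
 * i.e. the completion is still unsignaled after the handler has returned.
 */
static bool waiter_would_hang(bool alloc_fails, bool with_fix)
{
	struct cmd_ent ent;

	init_completion(&ent.slotted);
	work_handler(&ent, alloc_fails, with_fix);
	return !ent.slotted.done;
}
```

In this model, `waiter_would_hang(true, false)` corresponds to the pre-patch error path and reports a stuck waiter, while `waiter_would_hang(true, true)` corresponds to the patched path and does not. A real waiter blocks in `wait_for_completion()` rather than polling a flag, so the model only captures which exit paths signal the completion, not the scheduling behavior.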