Message ID | 20250106091426.256243-1-zhaochenguang@kylinos.cn (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | net/mlx5: Fix variable not being completed when function returns |

On 1/6/2025 11:14 AM, Chenguang Zhao wrote:
> The cmd_work_handler function returns from the child function
> cmd_alloc_index because the allocate command entry fails,
> Before returning, there is no complete ent->slotted.

nit : s/Before/before

>
> The patch fixes it.
>
> Trace:
>
> mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to
> allocate command entry
> INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
> Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> kworker/13:2 D 0 4055883 2 0x00000228
> Workqueue: events mlx5e_tx_dim_work [mlx5_core]
> Call trace:
> __switch_to+0xe8/0x150
> __schedule+0x2a8/0x9b8
> schedule+0x2c/0x88
> schedule_timeout+0x204/0x478
> wait_for_common+0x154/0x250
> wait_for_completion+0x28/0x38
> cmd_exec+0x7a0/0xa00 [mlx5_core]
> mlx5_cmd_exec+0x54/0x80 [mlx5_core]
> mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
> mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
> mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
> process_one_work+0x1b0/0x448
> worker_thread+0x54/0x468
> kthread+0x134/0x138
> ret_from_fork+0x10/0x18
>
> Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>

Thanks for your fix!
Please add the following fixes tag:

Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")

Reviewed-by: Moshe Shemesh <moshe@nvidia.com>

> ---
> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> index 6bd8a18e3af3..e733b81e18a2 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> @@ -1013,6 +1013,7 @@ static void cmd_work_handler(struct work_struct *work)
>  				complete(&ent->done);
>  			}
>  			up(&cmd->vars.sem);
> +			complete(&ent->slotted);
>  			return;
>  		}
>  	} else {

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 6bd8a18e3af3..e733b81e18a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1013,6 +1013,7 @@ static void cmd_work_handler(struct work_struct *work)
 				complete(&ent->done);
 			}
 			up(&cmd->vars.sem);
+			complete(&ent->slotted);
 			return;
 		}
 	} else {

The cmd_work_handler function returns from the child function
cmd_alloc_index because the allocate command entry fails,
Before returning, there is no complete ent->slotted.

The patch fixes it.

Trace:

mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to
allocate command entry
INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kworker/13:2 D 0 4055883 2 0x00000228
Workqueue: events mlx5e_tx_dim_work [mlx5_core]
Call trace:
__switch_to+0xe8/0x150
__schedule+0x2a8/0x9b8
schedule+0x2c/0x88
schedule_timeout+0x204/0x478
wait_for_common+0x154/0x250
wait_for_completion+0x28/0x38
cmd_exec+0x7a0/0xa00 [mlx5_core]
mlx5_cmd_exec+0x54/0x80 [mlx5_core]
mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
process_one_work+0x1b0/0x448
worker_thread+0x54/0x468
kthread+0x134/0x138
ret_from_fork+0x10/0x18

Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
1 file changed, 1 insertion(+)
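
For readers unfamiliar with the pattern, the hang described above can be illustrated outside the kernel. What follows is only a user-space sketch, not the mlx5 code: the completion type is a hypothetical pthread-based stand-in for the kernel's struct completion, struct cmd_ent is reduced to the fields the sketch needs, the alloc_failed flag and run_demo() helper are invented for illustration, and the -EAGAIN value is an arbitrary error code. It assumes, as the trace suggests, that some caller blocks until the worker signals ent->slotted.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Mark the event as done and wake every waiter. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Block until complete() has been called; there is no timeout. */
static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical, heavily reduced command entry. */
struct cmd_ent {
	struct completion slotted;
	bool alloc_failed;	/* illustration only: forces the error path */
	int ret;
};

/* Worker: every return path must signal ent->slotted. */
static void *work_handler(void *arg)
{
	struct cmd_ent *ent = arg;

	if (ent->alloc_failed) {
		ent->ret = -EAGAIN;	/* arbitrary error code for the sketch */
		/* Without this call the waiter below would block forever. */
		complete(&ent->slotted);
		return NULL;
	}

	ent->ret = 0;
	complete(&ent->slotted);
	return NULL;
}

/* Submit one command, wait for the slot event, return the result. */
int run_demo(bool alloc_failed)
{
	struct cmd_ent ent = { .alloc_failed = alloc_failed, .ret = 0 };
	pthread_t worker;

	init_completion(&ent.slotted);
	if (pthread_create(&worker, NULL, work_handler, &ent) != 0)
		return -1;

	wait_for_completion(&ent.slotted);
	pthread_join(worker, NULL);
	return ent.ret;
}

/* Both paths return instead of hanging. */
void self_check(void)
{
	assert(run_demo(true) == -EAGAIN);
	assert(run_demo(false) == 0);
}
```

Build with -pthread and call self_check() from a main() of your own. Deleting the complete() call in the error branch makes run_demo(true) block indefinitely, which is the user-space analogue of the hung task in the trace.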