[net] net/mlx5: Fix error handling in mlx5_init_one_light()

Message ID a2bb6a55-5415-4c15-bee9-9e63f4b6a339@moroto.mountain (mailing list archive)
State Rejected
Delegated to: Netdev Maintainers
Series [net] net/mlx5: Fix error handling in mlx5_init_one_light()

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 926 this patch: 926
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 10 of 10 maintainers
netdev/build_clang success Errors and warnings before: 937 this patch: 937
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 937 this patch: 937
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 29 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-05-10--18-00 (tests: 1014)

Commit Message

Dan Carpenter May 9, 2024, 11 a.m. UTC
If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
devl_unlock() is a bug.  It's not registered and it's not locked.  That
will trigger a stack trace in this case because devl_unregister() checks
both those things at the start of the function.

If mlx5_devlink_params_register() fails then this code will call
devl_unregister() and devl_unlock() twice which will again lead to a
stack trace or possibly something worse as well.

Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Comments

Larysa Zaremba May 10, 2024, 6:44 a.m. UTC | #1
On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> devl_unlock() is a bug.  It's not registered and it's not locked.  That
> will trigger a stack trace in this case because devl_unregister() checks
> both those things at the start of the function.
> 
> If mlx5_devlink_params_register() fails then this code will call
> devl_unregister() and devl_unlock() twice which will again lead to a
> stack trace or possibly something worse as well.
> 
> Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")

Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>

> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/main.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> index 331ce47f51a1..105c98160327 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> @@ -1690,7 +1690,7 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
>  	err = mlx5_query_hca_caps_light(dev);
>  	if (err) {
>  		mlx5_core_warn(dev, "mlx5_query_hca_caps_light err=%d\n", err);
> -		goto query_hca_caps_err;
> +		goto err_function_disable;
>  	}
>  
>  	devl_lock(devlink);
> @@ -1699,18 +1699,16 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
>  	err = mlx5_devlink_params_register(priv_to_devlink(dev));
>  	if (err) {
>  		mlx5_core_warn(dev, "mlx5_devlink_param_reg err = %d\n", err);
> -		goto params_reg_err;
> +		goto err_unregister;
>  	}
>  
>  	devl_unlock(devlink);
>  	return 0;
>  
> -params_reg_err:
> -	devl_unregister(devlink);
> -	devl_unlock(devlink);
> -query_hca_caps_err:
> +err_unregister:
>  	devl_unregister(devlink);
>  	devl_unlock(devlink);
> +err_function_disable:
>  	mlx5_function_disable(dev, true);
>  out:
>  	dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
> -- 
> 2.43.0
> 
>
Simon Horman May 11, 2024, 2:23 p.m. UTC | #2
On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> devl_unlock() is a bug.  It's not registered and it's not locked.  That
> will trigger a stack trace in this case because devl_unregister() checks
> both those things at the start of the function.
> 
> If mlx5_devlink_params_register() fails then this code will call
> devl_unregister() and devl_unlock() twice which will again lead to a
> stack trace or possibly something worse as well.
> 
> Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>

Hi Dan,

I believe that after you posted this patch, a different fix for this was
added to net as:

3c453e8cc672 ("net/mlx5: Fix peer devlink set for SF representor devlink port")
Dan Carpenter May 12, 2024, 8:20 a.m. UTC | #3
On Sat, May 11, 2024 at 03:23:04PM +0100, Simon Horman wrote:
> On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> > If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> > devl_unlock() is a bug.  It's not registered and it's not locked.  That
> > will trigger a stack trace in this case because devl_unregister() checks
> > both those things at the start of the function.
> > 
> > If mlx5_devlink_params_register() fails then this code will call
> > devl_unregister() and devl_unlock() twice which will again lead to a
> > stack trace or possibly something worse as well.
> > 
> > Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> > Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
> > Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
> 
> Hi Dan,
> 
> I believe that after you posted this patch, a different fix for this was
> added to net as:
> 
> 3c453e8cc672 ("net/mlx5: Fix peer devlink set for SF representor devlink port")
> 

Ah good.  Plus that patch has been tested.

regards,
dan carpenter
Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 331ce47f51a1..105c98160327 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1690,7 +1690,7 @@  int mlx5_init_one_light(struct mlx5_core_dev *dev)
 	err = mlx5_query_hca_caps_light(dev);
 	if (err) {
 		mlx5_core_warn(dev, "mlx5_query_hca_caps_light err=%d\n", err);
-		goto query_hca_caps_err;
+		goto err_function_disable;
 	}
 
 	devl_lock(devlink);
@@ -1699,18 +1699,16 @@  int mlx5_init_one_light(struct mlx5_core_dev *dev)
 	err = mlx5_devlink_params_register(priv_to_devlink(dev));
 	if (err) {
 		mlx5_core_warn(dev, "mlx5_devlink_param_reg err = %d\n", err);
-		goto params_reg_err;
+		goto err_unregister;
 	}
 
 	devl_unlock(devlink);
 	return 0;
 
-params_reg_err:
-	devl_unregister(devlink);
-	devl_unlock(devlink);
-query_hca_caps_err:
+err_unregister:
 	devl_unregister(devlink);
 	devl_unlock(devlink);
+err_function_disable:
 	mlx5_function_disable(dev, true);
 out:
 	dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
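
The unwind ordering described in the commit message can be checked outside the kernel with a small stand-alone model. This is only a sketch under stated assumptions: `fake_devlink`, the `fake_*` helpers, and `init_light()` are hypothetical stand-ins, not mlx5 or devlink code, and it assumes the devlink instance is registered under the lock in the lines elided between the two hunks. Each helper counts a call made in the wrong state, which is roughly what the checks in devl_unregister() would report as a stack trace.

```c
#include <stdbool.h>

/* Hypothetical model of the devlink instance state; not kernel code. */
struct fake_devlink {
	bool locked;
	bool registered;
	int misuse;	/* number of calls made in the wrong state */
};

static void fake_lock(struct fake_devlink *dl)
{
	if (dl->locked)
		dl->misuse++;
	dl->locked = true;
}

static void fake_unlock(struct fake_devlink *dl)
{
	if (!dl->locked)
		dl->misuse++;
	dl->locked = false;
}

static void fake_register(struct fake_devlink *dl)
{
	if (!dl->locked || dl->registered)
		dl->misuse++;
	dl->registered = true;
}

static void fake_unregister(struct fake_devlink *dl)
{
	/* Must be both locked and registered, as the commit message notes. */
	if (!dl->locked || !dl->registered)
		dl->misuse++;
	dl->registered = false;
}

/*
 * fail_step: 0 = success, 1 = capability query fails,
 * 2 = parameter registration fails.
 */
static int init_light(struct fake_devlink *dl, int fail_step)
{
	int err;

	err = (fail_step == 1) ? -1 : 0;
	if (err)
		goto err_function_disable;	/* nothing locked or registered yet */

	fake_lock(dl);
	fake_register(dl);

	err = (fail_step == 2) ? -1 : 0;
	if (err)
		goto err_unregister;

	fake_unlock(dl);
	return 0;

err_unregister:
	fake_unregister(dl);
	fake_unlock(dl);
err_function_disable:
	/* The real function would disable the device function here. */
	return err;
}
```

With the labels arranged this way, calling `init_light()` with `fail_step` set to 1 or 2 on a zero-initialized `struct fake_devlink` should leave `misuse` at 0. Jumping to `err_unregister` from the first failure, or unwinding twice from the second, would increment it.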