Message ID | a2bb6a55-5415-4c15-bee9-9e63f4b6a339@moroto.mountain (mailing list archive) |
---|---|
State | Not Applicable |
Series | [net] net/mlx5: Fix error handling in mlx5_init_one_light() |
On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> devl_unlock() is a bug. It's not registered and it's not locked. That
> will trigger a stack trace in this case because devl_unregister() checks
> both those things at the start of the function.
>
> If mlx5_devlink_params_register() fails then this code will call
> devl_unregister() and devl_unlock() twice which will again lead to a
> stack trace or possibly something worse as well.
>
> Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")

Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>

> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/main.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> index 331ce47f51a1..105c98160327 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> @@ -1690,7 +1690,7 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
>  	err = mlx5_query_hca_caps_light(dev);
>  	if (err) {
>  		mlx5_core_warn(dev, "mlx5_query_hca_caps_light err=%d\n", err);
> -		goto query_hca_caps_err;
> +		goto err_function_disable;
>  	}
>
>  	devl_lock(devlink);
> @@ -1699,18 +1699,16 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
>  	err = mlx5_devlink_params_register(priv_to_devlink(dev));
>  	if (err) {
>  		mlx5_core_warn(dev, "mlx5_devlink_param_reg err = %d\n", err);
> -		goto params_reg_err;
> +		goto err_unregister;
>  	}
>
>  	devl_unlock(devlink);
>  	return 0;
>
> -params_reg_err:
> -	devl_unregister(devlink);
> -	devl_unlock(devlink);
> -query_hca_caps_err:
> +err_unregister:
>  	devl_unregister(devlink);
>  	devl_unlock(devlink);
> +err_function_disable:
>  	mlx5_function_disable(dev, true);
>  out:
>  	dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
> --
> 2.43.0
>
On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> devl_unlock() is a bug. It's not registered and it's not locked. That
> will trigger a stack trace in this case because devl_unregister() checks
> both those things at the start of the function.
>
> If mlx5_devlink_params_register() fails then this code will call
> devl_unregister() and devl_unlock() twice which will again lead to a
> stack trace or possibly something worse as well.
>
> Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>

Hi Dan,

I believe that after you posted this patch, a different fix for this was
added to net as:

3c453e8cc672 ("net/mlx5: Fix peer devlink set for SF representor devlink port")
On Sat, May 11, 2024 at 03:23:04PM +0100, Simon Horman wrote:
> On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> > If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
> > devl_unlock() is a bug. It's not registered and it's not locked. That
> > will trigger a stack trace in this case because devl_unregister() checks
> > both those things at the start of the function.
> >
> > If mlx5_devlink_params_register() fails then this code will call
> > devl_unregister() and devl_unlock() twice which will again lead to a
> > stack trace or possibly something worse as well.
> >
> > Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> > Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
> > Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
>
> Hi Dan,
>
> I believe that after you posted this patch, a different fix for this was
> added to net as:
>
> 3c453e8cc672 ("net/mlx5: Fix peer devlink set for SF representor devlink port")
>

Ah good. Plus that patch has been tested.

regards,
dan carpenter
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 331ce47f51a1..105c98160327 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1690,7 +1690,7 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
 	err = mlx5_query_hca_caps_light(dev);
 	if (err) {
 		mlx5_core_warn(dev, "mlx5_query_hca_caps_light err=%d\n", err);
-		goto query_hca_caps_err;
+		goto err_function_disable;
 	}
 
 	devl_lock(devlink);
@@ -1699,18 +1699,16 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
 	err = mlx5_devlink_params_register(priv_to_devlink(dev));
 	if (err) {
 		mlx5_core_warn(dev, "mlx5_devlink_param_reg err = %d\n", err);
-		goto params_reg_err;
+		goto err_unregister;
 	}
 
 	devl_unlock(devlink);
 	return 0;
 
-params_reg_err:
-	devl_unregister(devlink);
-	devl_unlock(devlink);
-query_hca_caps_err:
+err_unregister:
 	devl_unregister(devlink);
 	devl_unlock(devlink);
+err_function_disable:
 	mlx5_function_disable(dev, true);
 out:
 	dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
If mlx5_query_hca_caps_light() fails then calling devl_unregister() or
devl_unlock() is a bug. It's not registered and it's not locked. That
will trigger a stack trace in this case because devl_unregister() checks
both those things at the start of the function.

If mlx5_devlink_params_register() fails then this code will call
devl_unregister() and devl_unlock() twice which will again lead to a
stack trace or possibly something worse as well.

Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)
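
The two failure modes described in the commit message can be reproduced in a small userspace model. The sketch below is not driver code: `struct fake_dev`, the `fake_*()` helpers, `fail_step` and `bad_unwinds` are all hypothetical stand-ins, and the `enabled` flag simply models whatever setup mlx5_function_disable() undoes. It only assumes what the diff shows, namely the old stacked labels versus one label per acquired resource, and counts teardown calls made in the wrong state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the device; none of these names exist in mlx5. */
struct fake_dev {
	bool enabled;     /* setup that the final disable step undoes */
	bool locked;      /* stands in for holding the devlink lock */
	bool registered;  /* stands in for being registered with devlink */
	int fail_step;    /* 0 = success, 1 = caps query fails, 2 = params register fails */
	int bad_unwinds;  /* teardown calls made while in the wrong state */
};

static void fake_unregister(struct fake_dev *d)
{
	/* Models the "registered and locked" checks the commit message mentions. */
	if (!d->registered || !d->locked)
		d->bad_unwinds++;
	d->registered = false;
}

static void fake_unlock(struct fake_dev *d)
{
	if (!d->locked)
		d->bad_unwinds++;
	d->locked = false;
}

static void fake_disable(struct fake_dev *d)
{
	if (!d->enabled)
		d->bad_unwinds++;
	d->enabled = false;
}

/* Label layout proposed by the patch: each label undoes one acquisition. */
static int init_light_fixed(struct fake_dev *d)
{
	int err;

	d->enabled = true;
	err = (d->fail_step == 1) ? -1 : 0;    /* caps query */
	if (err)
		goto err_function_disable;       /* not locked, not registered yet */

	d->locked = true;
	d->registered = true;
	err = (d->fail_step == 2) ? -1 : 0;    /* params register */
	if (err)
		goto err_unregister;

	d->locked = false;                     /* success: drop lock, stay registered */
	return 0;

err_unregister:
	fake_unregister(d);
	fake_unlock(d);
err_function_disable:
	fake_disable(d);
	return err;
}

/* Pre-patch layout: both labels unregister and unlock, and they fall through. */
static int init_light_buggy(struct fake_dev *d)
{
	int err;

	d->enabled = true;
	err = (d->fail_step == 1) ? -1 : 0;
	if (err)
		goto query_hca_caps_err;

	d->locked = true;
	d->registered = true;
	err = (d->fail_step == 2) ? -1 : 0;
	if (err)
		goto params_reg_err;

	d->locked = false;
	return 0;

params_reg_err:
	fake_unregister(d);
	fake_unlock(d);
query_hca_caps_err:
	fake_unregister(d);                    /* wrong on both failure paths */
	fake_unlock(d);
	fake_disable(d);
	return err;
}
```

Calling either function on a zero-initialized `struct fake_dev` with `fail_step` set to 1 or 2 and then reading `bad_unwinds` shows the difference: the fixed layout never tears down something it did not set up, while the stacked labels do so on both failure paths.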