Message ID | 20250214132453.4108-1-michal.swiatkowski@linux.intel.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v1] devlink: fix xa_alloc_cyclic error handling | expand |
On Fri, Feb 14, 2025 at 02:24:53PM +0100, Michal Swiatkowski wrote: > Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error > from xa_alloc_cyclic() in scheduler code [1]. The same is done in > devlink_rel_alloc(). If the same bug exists twice it might exist more times. Did you find this instance by searching the whole tree? Or just networking? This is also something which would be good to have the static analysers check for. I wounder if smatch can check this? Andrew > > In case of returning 1 from xa_alloc_cyclic() (wrapping) ERR_PTR(1) will > be returned, which will cause IS_ERR() to be false. Which can lead to > dereference not allocated pointer (rel). > > Fix it by checking if err is lower than zero. > > This wasn't found in real usecase, only noticed. Credit to Pierre. > > Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> > --- > net/devlink/core.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/devlink/core.c b/net/devlink/core.c > index f49cd83f1955..7203c39532fc 100644 > --- a/net/devlink/core.c > +++ b/net/devlink/core.c > @@ -117,7 +117,7 @@ static struct devlink_rel *devlink_rel_alloc(void) > > err = xa_alloc_cyclic(&devlink_rels, &rel->index, rel, > xa_limit_32b, &next, GFP_KERNEL); > - if (err) { > + if (err < 0) { > kfree(rel); > return ERR_PTR(err); > } > -- > 2.42.0 > >
On Fri, Feb 14, 2025 at 02:44:49PM +0100, Andrew Lunn wrote: > On Fri, Feb 14, 2025 at 02:24:53PM +0100, Michal Swiatkowski wrote: > > Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error > > from xa_alloc_cyclic() in scheduler code [1]. The same is done in > > devlink_rel_alloc(). > > If the same bug exists twice it might exist more times. Did you find > this instance by searching the whole tree? Or just networking? > > This is also something which would be good to have the static > analysers check for. I wounder if smatch can check this? > > Andrew > You are right, I checked only net folder and there are two usage like that in drivers. I will send v2 with wider fixing, thanks. It can be not so easy to check. What if someone want to treat wrapping as an error (don't know if it is valid)? If one of the caller is checking err < 0 it will be fine. Thanks, Michal > > > > In case of returning 1 from xa_alloc_cyclic() (wrapping) ERR_PTR(1) will > > be returned, which will cause IS_ERR() to be false. Which can lead to > > dereference not allocated pointer (rel). > > > > Fix it by checking if err is lower than zero. > > > > This wasn't found in real usecase, only noticed. Credit to Pierre. > > > > Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> > > --- > > net/devlink/core.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/net/devlink/core.c b/net/devlink/core.c > > index f49cd83f1955..7203c39532fc 100644 > > --- a/net/devlink/core.c > > +++ b/net/devlink/core.c > > @@ -117,7 +117,7 @@ static struct devlink_rel *devlink_rel_alloc(void) > > > > err = xa_alloc_cyclic(&devlink_rels, &rel->index, rel, > > xa_limit_32b, &next, GFP_KERNEL); > > - if (err) { > > + if (err < 0) { > > kfree(rel); > > return ERR_PTR(err); > > } > > -- > > 2.42.0 > > > >
On Fri, Feb 14, 2025 at 02:58:41PM +0100, Michal Swiatkowski wrote: > On Fri, Feb 14, 2025 at 02:44:49PM +0100, Andrew Lunn wrote: > > On Fri, Feb 14, 2025 at 02:24:53PM +0100, Michal Swiatkowski wrote: > > > Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error > > > from xa_alloc_cyclic() in scheduler code [1]. The same is done in > > > devlink_rel_alloc(). > > > > If the same bug exists twice it might exist more times. Did you find > > this instance by searching the whole tree? Or just networking? > > > > This is also something which would be good to have the static > > analysers check for. I wounder if smatch can check this? > > > > Andrew > > > > You are right, I checked only net folder and there are two usage like > that in drivers. I will send v2 with wider fixing, thanks. > > It can be not so easy to check. What if someone want to treat wrapping > as an error (don't know if it is valid)? If one of the caller is > checking err < 0 it will be fine. I put Dan in Cc:, lets see what he thinks. There is at least one other functions i can think of which has similar behaviour, < 0 on error, 0 or 1 are both different sorts of success. If there are two, there are probably more. Having tooling to find this sort of problem would be nice, even if it has a high false positive rate and needs combining with manual inspection. Andrew
On Fri, Feb 14, 2025 at 02:44:49PM +0100, Andrew Lunn wrote: > On Fri, Feb 14, 2025 at 02:24:53PM +0100, Michal Swiatkowski wrote: > > Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error > > from xa_alloc_cyclic() in scheduler code [1]. The same is done in > > devlink_rel_alloc(). > > If the same bug exists twice it might exist more times. Did you find > this instance by searching the whole tree? Or just networking? > > This is also something which would be good to have the static > analysers check for. I wounder if smatch can check this? That's a great idea, thanks! I'll try a couple experiments and see what works tomorrow. I've add these lines to check_zero_to_err_ptr.c 183 max = rl_max(estate_rl(sm->state)); 184 if (max.value > 0 && !sval_is_a_max(max)) 185 sm_warning("passing non-max range '%s' to '%s'", sm->state->name, fn); 186 I'm hoping this one works. It complains about any positive returns except for when the return is "some non-zero value". 194 if (estate_get_single_value(tmp->state, &sval) && 195 (sval.value < -4096 || sval.value > 0)) { 196 sm_warning("passing invalid error code %lld to '%s'", sval.value, fn); 197 return; 198 } This one might miss some bugs but it should catch most stuff and have few false positives. Both of them work on this example. net/devlink/core.c:122 devlink_rel_alloc() warn: passing non-max range '(-4095)-(-1),1' to 'ERR_PTR' net/devlink/core.c:122 devlink_rel_alloc() warn: passing invalid error code 1 to 'ERR_PTR' regards, dan carpenter
On Sun, Feb 16, 2025 at 06:06:48PM +0300, Dan Carpenter wrote: > On Fri, Feb 14, 2025 at 02:44:49PM +0100, Andrew Lunn wrote: > > On Fri, Feb 14, 2025 at 02:24:53PM +0100, Michal Swiatkowski wrote: > > > Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error > > > from xa_alloc_cyclic() in scheduler code [1]. The same is done in > > > devlink_rel_alloc(). > > > > If the same bug exists twice it might exist more times. Did you find > > this instance by searching the whole tree? Or just networking? > > > > This is also something which would be good to have the static > > analysers check for. I wounder if smatch can check this? > > That's a great idea, thanks! I'll try a couple experiments and see what > works tomorrow. I've add these lines to check_zero_to_err_ptr.c > > 183 max = rl_max(estate_rl(sm->state)); > 184 if (max.value > 0 && !sval_is_a_max(max)) > 185 sm_warning("passing non-max range '%s' to '%s'", sm->state->name, fn); > 186 > > I'm hoping this one works. It complains about any positive returns > except for when the return is "some non-zero value". > > 194 if (estate_get_single_value(tmp->state, &sval) && > 195 (sval.value < -4096 || sval.value > 0)) { > 196 sm_warning("passing invalid error code %lld to '%s'", sval.value, fn); > 197 return; > 198 } > > This one might miss some bugs but it should catch most stuff and have few > false positives. Both of them work on this example. > > net/devlink/core.c:122 devlink_rel_alloc() warn: passing non-max range '(-4095)-(-1),1' to 'ERR_PTR' > net/devlink/core.c:122 devlink_rel_alloc() warn: passing invalid error code 1 to 'ERR_PTR' Nice. In networking, ethernet PHYs, there are a few functions ending in _changed() have the same behaviour: * Returns negative errno, 0 if there was no change, and 1 in case of change So there is the potential for the same issue with mdiobus_modify_changed(), phy_modify_changed(), phy_modify_mmd_changed(), phy_modify_paged_changed(). Hope this helps with testing. Andrew
Both versions find 97 warning but they're slightly different so I should create some kind of combined check. drivers/input/touchscreen/cyttsp_core.c:658 cyttsp_probe() warn: passing non-max range 's32min-(-1),1' to 'ERR_PTR' drivers/gpu/drm/mediatek/mtk_dp.c:2736 mtk_dp_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/clk/clk-gpio.c:371 clk_gated_fixed_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/net/ethernet/socionext/netsec.c:1902 netsec_acpi_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/net/ethernet/socionext/netsec.c:1909 netsec_acpi_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c:2050 mcp251xfd_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/net/can/spi/hi311x.c:857 hi3110_can_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/net/can/dev/dev.c:496 can_get_termination() warn: passing non-max range 's32min-(-1),1' to 'ERR_PTR' drivers/pwm/pwm-sl28cpld.c:228 sl28cpld_pwm_probe() warn: passing non-max range 's32min-(-1),1' to 'ERR_PTR' drivers/leds/leds-is31fl319x.c:423 is31fl319x_parse_fw() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/leds/rgb/leds-mt6370-rgb.c:738 mt6370_assign_multicolor_info() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/leds/rgb/leds-mt6370-rgb.c:945 mt6370_leds_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/leds/rgb/leds-mt6370-rgb.c:952 mt6370_leds_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' The first return from acpi_data_prop_read() could be 1. I'll report this. drivers/gpu/drm/msm/msm_gem_submit.c:537 msm_parse_deps() warn: passing non-max range '(-4095)-(-1),22' to 'ERR_PTR' Clear bug. Double negative -(-EINVAL). I'll send a patch. drivers/gpu/drm/xe/xe_guc.c:1241 xe_guc_suspend() warn: passing non-max range '(-110),(-71),(-6),1-268435455' to 'ERR_PTR' drivers/gpu/drm/i915/gt/uc/intel_guc.c:698 intel_guc_suspend() warn: passing non-max range '1-268435455' to 'ERR_PTR' Smatch gets confused by the return FIELD_GET(GUC_HXG_RESPONSE_MSG_0_DATA0, header); in xe_guc_mmio_send_recv(). drivers/spi/spi-pxa2xx-platform.c:101 pxa2xx_spi_init_pdata() warn: passing non-max range 's32min-(-1),1' to 'ERR_PTR' drivers/spi/spi-cs42l43.c:396 cs42l43_spi_probe() warn: passing non-max range 's32min-(-23),(-21)-(-1),1' to 'dev_err_probe' drivers/spi/spi-rockchip-sfc.c:654 rockchip_sfc_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/net/dsa/mv88e6xxx/pcs-6352.c:250 marvell_c22_pcs_link_up() warn: passing non-max range '(-110),(-95),(-16),1' to 'ERR_PTR' False positive. marvell_c22_pcs_restore_page() is complicated. drivers/net/phy/phy_device.c:1150 phy_connect() warn: passing non-max range 's32min-(-1),1' to 'ERR_PTR' drivers/net/phy/phy_device.c:1651 phy_attach() warn: passing non-max range 's32min-(-1),1' to 'ERR_PTR' drivers/net/ethernet/ethoc.c:719 ethoc_mdio_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c:178 hbg_phy_connect() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' xa_alloc_cyclic() returns 1 if the allocation succeeded after wrapping. I'll report this. drivers/net/ethernet/aquantia/atlantic/aq_ring.c:481 aq_xdp_run_prog() warn: passing non-max range '(-16),0,5-6' to 'ERR_PTR' Bug. Mixing error codes from aq_hw_err_from_flags() and NETDEV_TX_BUSY. drivers/media/i2c/ds90ub913.c:856 ub913_probe() warn: passing non-max range 's32min-(-1),1' to 'dev_err_probe' This is a transient bug that went away after a database rebuild. I'll debug the rest over time. regards, dan carpenter
On Sun, Feb 16, 2025 at 05:08:23PM +0100, Andrew Lunn wrote: > * Returns negative errno, 0 if there was no change, and 1 in case of change > > So there is the potential for the same issue with > mdiobus_modify_changed(), phy_modify_changed(), > phy_modify_mmd_changed(), phy_modify_paged_changed(). Hope this helps > with testing. Thanks, that's useful. The first check would have caught all these, but the second one would only have caught the first three. The combined check will catch them all. :) regards, dan carpenter
On 2/14/25 2:58 PM, Michal Swiatkowski wrote: > On Fri, Feb 14, 2025 at 02:44:49PM +0100, Andrew Lunn wrote: >> On Fri, Feb 14, 2025 at 02:24:53PM +0100, Michal Swiatkowski wrote: >>> Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error >>> from xa_alloc_cyclic() in scheduler code [1]. The same is done in >>> devlink_rel_alloc(). >> >> If the same bug exists twice it might exist more times. Did you find >> this instance by searching the whole tree? Or just networking? >> >> This is also something which would be good to have the static >> analysers check for. I wounder if smatch can check this? >> >> Andrew >> > > You are right, I checked only net folder and there are two usage like > that in drivers. I will send v2 with wider fixing, thanks. While at that, please add the suitable fixes tag(s). Thanks, Paolo
diff --git a/net/devlink/core.c b/net/devlink/core.c index f49cd83f1955..7203c39532fc 100644 --- a/net/devlink/core.c +++ b/net/devlink/core.c @@ -117,7 +117,7 @@ static struct devlink_rel *devlink_rel_alloc(void) err = xa_alloc_cyclic(&devlink_rels, &rel->index, rel, xa_limit_32b, &next, GFP_KERNEL); - if (err) { + if (err < 0) { kfree(rel); return ERR_PTR(err); }
Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error from xa_alloc_cyclic() in scheduler code [1]. The same is done in devlink_rel_alloc(). In case of returning 1 from xa_alloc_cyclic() (wrapping) ERR_PTR(1) will be returned, which will cause IS_ERR() to be false. Which can lead to dereference not allocated pointer (rel). Fix it by checking if err is lower than zero. This wasn't found in real usecase, only noticed. Credit to Pierre. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> --- net/devlink/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)