Message ID | 20241013-vchiq_arm-of_node_put-v1-1-f72b2a6e47d0@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | staging: vchiq_arm: Fix missing refcount decrement in error path for fw_node | expand |
Hi Javier, Thank you for the patch. On 13/10/24 4:12 pm, Javier Carrasco wrote: > An error path was introduced without including the required call to > of_node_put() to decrement the node's refcount and avoid leaking memory. > If the call to kzalloc() for 'mgmt' fails, the probe returns without > decrementing the refcount. > > Use the automatic cleanup facility to fix the bug and protect the code > against new error paths where the call to of_node_put() might be missing > again. > > Cc: stable@vger.kernel.org > Fixes: 1c9e16b73166 ("staging: vc04_services: vchiq_arm: Split driver static and runtime data") > Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> > --- > drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c > index 27ceaac8f6cc..792cf3a807e1 100644 > --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c > +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c > @@ -1332,7 +1332,8 @@ MODULE_DEVICE_TABLE(of, vchiq_of_match); > > static int vchiq_probe(struct platform_device *pdev) > { > - struct device_node *fw_node; > + struct device_node *fw_node __free(device_node) = > + of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware"); How about : + struct device_node *fw_node __free(device_node) = NULL; > const struct vchiq_platform_info *info; > struct vchiq_drv_mgmt *mgmt; > int ret; > @@ -1341,8 +1342,6 @@ static int vchiq_probe(struct platform_device *pdev) > if (!info) > return -EINVAL; > > - fw_node = of_find_compatible_node(NULL, NULL, > - "raspberrypi,bcm2835-firmware"); And undo this (i.e. keep the of_find_compatible_node() call here This helps with readability as there is a NULL check just after this. > if (!fw_node) { > dev_err(&pdev->dev, "Missing firmware node\n"); > return -ENOENT; > @@ -1353,7 +1352,6 @@ static int vchiq_probe(struct platform_device *pdev) > return -ENOMEM; > > mgmt->fw = devm_rpi_firmware_get(&pdev->dev, fw_node); > - of_node_put(fw_node); And this change remains the same. > if (!mgmt->fw) > return -EPROBE_DEFER; > > > --- > base-commit: d61a00525464bfc5fe92c6ad713350988e492b88 > change-id: 20241013-vchiq_arm-of_node_put-60a5eaaafd70 > > Best regards,
On 13/10/2024 13:36, Umang Jain wrote: > Hi Javier, > > Thank you for the patch. > > On 13/10/24 4:12 pm, Javier Carrasco wrote: >> An error path was introduced without including the required call to >> of_node_put() to decrement the node's refcount and avoid leaking memory. >> If the call to kzalloc() for 'mgmt' fails, the probe returns without >> decrementing the refcount. >> >> Use the automatic cleanup facility to fix the bug and protect the code >> against new error paths where the call to of_node_put() might be missing >> again. >> >> Cc: stable@vger.kernel.org >> Fixes: 1c9e16b73166 ("staging: vc04_services: vchiq_arm: Split driver >> static and runtime data") >> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> >> --- >> drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c | 6 + >> +---- >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/ >> vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/ >> vchiq_arm.c >> index 27ceaac8f6cc..792cf3a807e1 100644 >> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c >> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c >> @@ -1332,7 +1332,8 @@ MODULE_DEVICE_TABLE(of, vchiq_of_match); >> static int vchiq_probe(struct platform_device *pdev) >> { >> - struct device_node *fw_node; >> + struct device_node *fw_node __free(device_node) = >> + of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835- >> firmware"); > > How about : > > + struct device_node *fw_node __free(device_node) = NULL; > >> const struct vchiq_platform_info *info; >> struct vchiq_drv_mgmt *mgmt; >> int ret; >> @@ -1341,8 +1342,6 @@ static int vchiq_probe(struct platform_device >> *pdev) >> if (!info) >> return -EINVAL; >> - fw_node = of_find_compatible_node(NULL, NULL, >> - "raspberrypi,bcm2835-firmware"); > > And undo this (i.e. keep the of_find_compatible_node() call here > > This helps with readability as there is a NULL check just after this. >> if (!fw_node) { >> dev_err(&pdev->dev, "Missing firmware node\n"); >> return -ENOENT; >> @@ -1353,7 +1352,6 @@ static int vchiq_probe(struct platform_device >> *pdev) >> return -ENOMEM; >> mgmt->fw = devm_rpi_firmware_get(&pdev->dev, fw_node); >> - of_node_put(fw_node); > > And this change remains the same. >> if (!mgmt->fw) >> return -EPROBE_DEFER; >> >> --- >> base-commit: d61a00525464bfc5fe92c6ad713350988e492b88 >> change-id: 20241013-vchiq_arm-of_node_put-60a5eaaafd70 >> >> Best regards, > Hi Umang, Sure, I am fine with that too. Depending on the maintainer, the preferred approach varies: a single initialization at the top whenever possible, a declaration right before its first usage (not my favorite), or a NULL initialization first. I will send a v2 with the latter i.e. what you suggested, as it keeps everything more similar to what it used to be. Thanks and best regards, Javier Carrasco
On 13/10/2024 13:36, Umang Jain wrote: > Hi Javier, > > Thank you for the patch. > > On 13/10/24 4:12 pm, Javier Carrasco wrote: >> An error path was introduced without including the required call to >> of_node_put() to decrement the node's refcount and avoid leaking memory. >> If the call to kzalloc() for 'mgmt' fails, the probe returns without >> decrementing the refcount. >> >> Use the automatic cleanup facility to fix the bug and protect the code >> against new error paths where the call to of_node_put() might be missing >> again. >> >> Cc: stable@vger.kernel.org >> Fixes: 1c9e16b73166 ("staging: vc04_services: vchiq_arm: Split driver static and runtime data") >> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> >> --- >> drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c | 6 ++---- >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c >> index 27ceaac8f6cc..792cf3a807e1 100644 >> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c >> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c >> @@ -1332,7 +1332,8 @@ MODULE_DEVICE_TABLE(of, vchiq_of_match); >> >> static int vchiq_probe(struct platform_device *pdev) >> { >> - struct device_node *fw_node; >> + struct device_node *fw_node __free(device_node) = >> + of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware"); > > How about : > > + struct device_node *fw_node __free(device_node) = NULL; > >> const struct vchiq_platform_info *info; >> struct vchiq_drv_mgmt *mgmt; >> int ret; >> @@ -1341,8 +1342,6 @@ static int vchiq_probe(struct platform_device *pdev) >> if (!info) >> return -EINVAL; >> >> - fw_node = of_find_compatible_node(NULL, NULL, >> - "raspberrypi,bcm2835-firmware"); > > And undo this (i.e. keep the of_find_compatible_node() call here The point of using cleanup is to have constructor and destructor in one place, not split. This is not in the spirit of cleanup. Linus also commented on this and cleanup.h *explicitly* recommends not doing so. It also lead to real bugs from time to time, so no, please do not insist on such weird way. Best regards, Krzysztof
On Sun, Oct 13, 2024 at 12:42:32PM +0200, Javier Carrasco wrote: > An error path was introduced without including the required call to > of_node_put() to decrement the node's refcount and avoid leaking memory. > If the call to kzalloc() for 'mgmt' fails, the probe returns without > decrementing the refcount. > > Use the automatic cleanup facility to fix the bug and protect the code > against new error paths where the call to of_node_put() might be missing > again. > > Cc: stable@vger.kernel.org > Fixes: 1c9e16b73166 ("staging: vc04_services: vchiq_arm: Split driver static and runtime data") > Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> > --- > drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c > index 27ceaac8f6cc..792cf3a807e1 100644 > --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c > +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c > @@ -1332,7 +1332,8 @@ MODULE_DEVICE_TABLE(of, vchiq_of_match); > > static int vchiq_probe(struct platform_device *pdev) > { > - struct device_node *fw_node; > + struct device_node *fw_node __free(device_node) = > + of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware"); > const struct vchiq_platform_info *info; > struct vchiq_drv_mgmt *mgmt; > int ret; > @@ -1341,8 +1342,6 @@ static int vchiq_probe(struct platform_device *pdev) > if (!info) > return -EINVAL; > > - fw_node = of_find_compatible_node(NULL, NULL, > - "raspberrypi,bcm2835-firmware"); Perhaps it's better to declare the variable here so that the function and the error handling are next to each other. if (!info) return -EINVAL; struct device_node *fw_node __free(device_node) = of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware"); if (!fw_node) { ... This is why we lifted the rule that variables had to be declared at the start of a function. regards, dan carpenter
On 14/10/2024 09:22, Dan Carpenter wrote: > On Sun, Oct 13, 2024 at 12:42:32PM +0200, Javier Carrasco wrote: >> An error path was introduced without including the required call to >> of_node_put() to decrement the node's refcount and avoid leaking memory. >> If the call to kzalloc() for 'mgmt' fails, the probe returns without >> decrementing the refcount. >> >> Use the automatic cleanup facility to fix the bug and protect the code >> against new error paths where the call to of_node_put() might be missing >> again. >> >> Cc: stable@vger.kernel.org >> Fixes: 1c9e16b73166 ("staging: vc04_services: vchiq_arm: Split driver static and runtime data") >> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> >> --- >> drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c | 6 ++---- >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c >> index 27ceaac8f6cc..792cf3a807e1 100644 >> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c >> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c >> @@ -1332,7 +1332,8 @@ MODULE_DEVICE_TABLE(of, vchiq_of_match); >> >> static int vchiq_probe(struct platform_device *pdev) >> { >> - struct device_node *fw_node; >> + struct device_node *fw_node __free(device_node) = >> + of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware"); >> const struct vchiq_platform_info *info; >> struct vchiq_drv_mgmt *mgmt; >> int ret; >> @@ -1341,8 +1342,6 @@ static int vchiq_probe(struct platform_device *pdev) >> if (!info) >> return -EINVAL; >> >> - fw_node = of_find_compatible_node(NULL, NULL, >> - "raspberrypi,bcm2835-firmware"); > > Perhaps it's better to declare the variable here so that the function and the > error handling are next to each other. > > if (!info) > return -EINVAL; > > struct device_node *fw_node __free(device_node) = > of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware"); > if (!fw_node) { > > ... > > This is why we lifted the rule that variables had to be declared at the start > of a function. > > regards, > dan carpenter > This approach is great as long as the maintainer accepts mid-scope variable declaration and the goto instructions get refactored, as stated in cleanup.h. The first point is not being that problematic so far, but the second one is trickier, and we all have to take special care to avoid such issues, even if they don't look dangerous in the current code, because adding a goto where there cleanup attribute is already used can be overlooked as well. Actually there are goto instructions in the function, but at least in their current form they are as harmless as useless. I will refactor them anyway in another patch to stick to the recommendations, and declare the device_node right before its first usage for v2. Thanks and best regards, Javier Carrasco
On Mon, Oct 14, 2024 at 09:59:49AM +0200, Javier Carrasco wrote: > This approach is great as long as the maintainer accepts mid-scope > variable declaration and the goto instructions get refactored, as stated > in cleanup.h. > > The first point is not being that problematic so far, but the second one > is trickier, and we all have to take special care to avoid such issues, > even if they don't look dangerous in the current code, because adding a > goto where there cleanup attribute is already used can be overlooked as > well. > To be honest, I don't really understand this paragraph. I think maybe you're talking about if we declare the variable at the top and forget to initialize it to NULL? It leads to an uninitialized variable if we exit the function before it is initialized. > Actually there are goto instructions in the function, but at least in > their current form they are as harmless as useless. Yep. Feel free to delete them. regards, dan carpenter
On 14/10/2024 10:12, Dan Carpenter wrote: > On Mon, Oct 14, 2024 at 09:59:49AM +0200, Javier Carrasco wrote: >> This approach is great as long as the maintainer accepts mid-scope >> variable declaration and the goto instructions get refactored, as stated >> in cleanup.h. >> >> The first point is not being that problematic so far, but the second one >> is trickier, and we all have to take special care to avoid such issues, >> even if they don't look dangerous in the current code, because adding a >> goto where there cleanup attribute is already used can be overlooked as >> well. >> > > To be honest, I don't really understand this paragraph. I think maybe you're > talking about if we declare the variable at the top and forget to initialize it > to NULL? It leads to an uninitialized variable if we exit the function before > it is initialized. > No, I am talking about declaring the variable mid-scope, and later on adding a goto before that declaration in a different patch, let's say far above the variable declaration. As soon as a goto is added, care must be taken to make sure that we don't have variables with the cleanup attribute in the scope. Just something to take into account. >> Actually there are goto instructions in the function, but at least in >> their current form they are as harmless as useless. > > Yep. Feel free to delete them. > > regards, > dan carpenter >
On Mon, Oct 14, 2024 at 10:15:25AM +0200, Javier Carrasco wrote: > On 14/10/2024 10:12, Dan Carpenter wrote: > > On Mon, Oct 14, 2024 at 09:59:49AM +0200, Javier Carrasco wrote: > >> This approach is great as long as the maintainer accepts mid-scope > >> variable declaration and the goto instructions get refactored, as stated > >> in cleanup.h. > >> > >> The first point is not being that problematic so far, but the second one > >> is trickier, and we all have to take special care to avoid such issues, > >> even if they don't look dangerous in the current code, because adding a > >> goto where there cleanup attribute is already used can be overlooked as > >> well. > >> > > > > To be honest, I don't really understand this paragraph. I think maybe you're > > talking about if we declare the variable at the top and forget to initialize it > > to NULL? It leads to an uninitialized variable if we exit the function before > > it is initialized. > > > > No, I am talking about declaring the variable mid-scope, and later on > adding a goto before that declaration in a different patch, let's say > far above the variable declaration. As soon as a goto is added, care > must be taken to make sure that we don't have variables with the cleanup > attribute in the scope. Just something to take into account. For this simple probe function, it's not an issue, please just make it as simple and clean as possible. thanks, greg k-h
On Mon, Oct 14, 2024 at 10:15:25AM +0200, Javier Carrasco wrote: > On 14/10/2024 10:12, Dan Carpenter wrote: > > On Mon, Oct 14, 2024 at 09:59:49AM +0200, Javier Carrasco wrote: > >> This approach is great as long as the maintainer accepts mid-scope > >> variable declaration and the goto instructions get refactored, as stated > >> in cleanup.h. > >> > >> The first point is not being that problematic so far, but the second one > >> is trickier, and we all have to take special care to avoid such issues, > >> even if they don't look dangerous in the current code, because adding a > >> goto where there cleanup attribute is already used can be overlooked as > >> well. > >> > > > > To be honest, I don't really understand this paragraph. I think maybe you're > > talking about if we declare the variable at the top and forget to initialize it > > to NULL? It leads to an uninitialized variable if we exit the function before > > it is initialized. > > > > No, I am talking about declaring the variable mid-scope, and later on > adding a goto before that declaration in a different patch, let's say > far above the variable declaration. As soon as a goto is added, care > must be taken to make sure that we don't have variables with the cleanup > attribute in the scope. Just something to take into account. > Huh. That's an interesting point. If you have: if (ret) goto done; struct device_node *fw_node __free(device_node) = something; Then fw_node isn't initialized when we get to done. However, in my simple test this triggered a build failure with Clang so I believe we would catch this sort of bug pretty quickly. regards, dan carpenter
On 14/10/2024 10:39, Dan Carpenter wrote: > On Mon, Oct 14, 2024 at 10:15:25AM +0200, Javier Carrasco wrote: >> On 14/10/2024 10:12, Dan Carpenter wrote: >>> On Mon, Oct 14, 2024 at 09:59:49AM +0200, Javier Carrasco wrote: >>>> This approach is great as long as the maintainer accepts mid-scope >>>> variable declaration and the goto instructions get refactored, as stated >>>> in cleanup.h. >>>> >>>> The first point is not being that problematic so far, but the second one >>>> is trickier, and we all have to take special care to avoid such issues, >>>> even if they don't look dangerous in the current code, because adding a >>>> goto where there cleanup attribute is already used can be overlooked as >>>> well. >>>> >>> >>> To be honest, I don't really understand this paragraph. I think maybe you're >>> talking about if we declare the variable at the top and forget to initialize it >>> to NULL? It leads to an uninitialized variable if we exit the function before >>> it is initialized. >>> >> >> No, I am talking about declaring the variable mid-scope, and later on >> adding a goto before that declaration in a different patch, let's say >> far above the variable declaration. As soon as a goto is added, care >> must be taken to make sure that we don't have variables with the cleanup >> attribute in the scope. Just something to take into account. >> > > Huh. That's an interesting point. If you have: > > if (ret) > goto done; > > struct device_node *fw_node __free(device_node) = something; > > Then fw_node isn't initialized when we get to done. However, in my simple test > this triggered a build failure with Clang so I believe we would catch this sort > of bug pretty quickly. > > regards, > dan carpenter > Yes, the only pity is that GCC (I guess still the most common compiler for the Linux kernel) stays silent, and it happily builds a buggy image. But as you said, the patch will trigger some alarms as soon as it is sent upstream. In this particular case, and as Greg pointed out, that is not a real threat anyway. My digression comes to an end, and v2 is on its way. Thanks and best regards, Javier Carrasco
On 14/10/2024 09:22, Dan Carpenter wrote: >> @@ -1341,8 +1342,6 @@ static int vchiq_probe(struct platform_device *pdev) >> if (!info) >> return -EINVAL; >> >> - fw_node = of_find_compatible_node(NULL, NULL, >> - "raspberrypi,bcm2835-firmware"); > > Perhaps it's better to declare the variable here so that the function and the > error handling are next to each other. > > if (!info) > return -EINVAL; > > struct device_node *fw_node __free(device_node) = > of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware"); > if (!fw_node) { > > ... > > This is why we lifted the rule that variables had to be declared at the start > of a function. > Ack, this is how this should look like. Best regards, Krzysztof
On Mon, Oct 14, 2024 at 10:49:28AM +0200, Javier Carrasco wrote: > In this particular case, and as Greg pointed out, that is not a real > threat anyway. My digression comes to an end, and v2 is on its way. > I mean the other thing that we would accept is if you moved the NULL check to be the first thing after the declaration block... regards, dan carpenter
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c index 27ceaac8f6cc..792cf3a807e1 100644 --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c @@ -1332,7 +1332,8 @@ MODULE_DEVICE_TABLE(of, vchiq_of_match); static int vchiq_probe(struct platform_device *pdev) { - struct device_node *fw_node; + struct device_node *fw_node __free(device_node) = + of_find_compatible_node(NULL, NULL, "raspberrypi,bcm2835-firmware"); const struct vchiq_platform_info *info; struct vchiq_drv_mgmt *mgmt; int ret; @@ -1341,8 +1342,6 @@ static int vchiq_probe(struct platform_device *pdev) if (!info) return -EINVAL; - fw_node = of_find_compatible_node(NULL, NULL, - "raspberrypi,bcm2835-firmware"); if (!fw_node) { dev_err(&pdev->dev, "Missing firmware node\n"); return -ENOENT; @@ -1353,7 +1352,6 @@ static int vchiq_probe(struct platform_device *pdev) return -ENOMEM; mgmt->fw = devm_rpi_firmware_get(&pdev->dev, fw_node); - of_node_put(fw_node); if (!mgmt->fw) return -EPROBE_DEFER;
An error path was introduced without including the required call to of_node_put() to decrement the node's refcount and avoid leaking memory. If the call to kzalloc() for 'mgmt' fails, the probe returns without decrementing the refcount. Use the automatic cleanup facility to fix the bug and protect the code against new error paths where the call to of_node_put() might be missing again. Cc: stable@vger.kernel.org Fixes: 1c9e16b73166 ("staging: vc04_services: vchiq_arm: Split driver static and runtime data") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> --- drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) --- base-commit: d61a00525464bfc5fe92c6ad713350988e492b88 change-id: 20241013-vchiq_arm-of_node_put-60a5eaaafd70 Best regards,