[v2,01/15] ARM: actions: fix a leaked reference by adding missing of_node_put

Message ID 1551785646-46173-1-git-send-email-wen.yang99@zte.com.cn (mailing list archive)
State New, archived

Commit Message

Wen Yang March 5, 2019, 11:33 a.m. UTC
The call to of_get_next_child returns a node pointer with its refcount
incremented, so it must be explicitly decremented after the last
usage.

Detected by coccinelle with the following warnings:
./arch/arm/mach-actions/platsmp.c:112:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 103, but without a corresponding object release within this function.
./arch/arm/mach-actions/platsmp.c:124:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 115, but without a corresponding object release within this function.
./arch/arm/mach-actions/platsmp.c:137:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 128, but without a corresponding object release within this function.

Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
v1->v2: add a missing space between "adding" and "missing"

 arch/arm/mach-actions/platsmp.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Russell King (Oracle) March 5, 2019, 11:40 a.m. UTC | #1
On Tue, Mar 05, 2019 at 07:33:52PM +0800, Wen Yang wrote:
> The call to of_get_next_child returns a node pointer with refcount
> incremented thus it must be explicitly decremented after the last
> usage.
> 
> Detected by coccinelle with the following warnings:
> ./arch/arm/mach-actions/platsmp.c:112:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 103, but without a corresponding object release within this function.
> ./arch/arm/mach-actions/platsmp.c:124:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 115, but without a corresponding object release within this function.
> ./arch/arm/mach-actions/platsmp.c:137:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 128, but without a corresponding object release within this function.

I question this.  Your reasoning is that the node is no longer used
so the reference count needs to be put.

However, in all these cases, data is read from the node's properties
and the device remains in use for the life of the kernel.  There is
a big difference here.

With normal drivers, each device is bound to its associated device
node.  When the device node goes away, the corresponding device goes
away too, which causes the driver to be unbound from the device.

However, there is another class of "driver", which includes the ones
below, where the devices are "permanent".  These can never go away: even
if the device node refcount hits zero and the device node is freed, the
device is still present and in use in the system.  So, having the
device node refcount hit zero is actually a bug: what that's saying
is that the system device (e.g. the SCU) has gone away.  If you somehow
were to remove the SCU from the system, you'd end up severing the
connection between the CPU cores and the rest of the system, obviously
resulting in a dead system!

So, what is the point of dropping these refcounts for devices that can
never go away - and thus their associated device_node should also never
go away?

Manivannan Sadhasivam March 9, 2019, 2:17 a.m. UTC | #2
Hi Russell,

On Tue, Mar 05, 2019 at 11:40:48AM +0000, Russell King - ARM Linux admin wrote:
> On Tue, Mar 05, 2019 at 07:33:52PM +0800, Wen Yang wrote:
> > The call to of_get_next_child returns a node pointer with refcount
> > incremented thus it must be explicitly decremented after the last
> > usage.
> > 
> > Detected by coccinelle with the following warnings:
> > ./arch/arm/mach-actions/platsmp.c:112:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 103, but without a corresponding object release within this function.
> > ./arch/arm/mach-actions/platsmp.c:124:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 115, but without a corresponding object release within this function.
> > ./arch/arm/mach-actions/platsmp.c:137:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 128, but without a corresponding object release within this function.
> 
> I question this.  Your reasoning is that the node is no longer used
> so the reference count needs to be put.
> 
> However, in all these cases, data is read from the node's properties
> and the device remains in use for the life of the kernel.  There is
> a big difference here.
> 
> With normal drivers, each device is bound to its associated device
> node.  When the device node goes away, the corresponding device goes
> away too, which causes the driver to be unbound from the device.
> 
> However, there is another class of "driver", which includes the ones
> below, where the devices are "permanent".  These can never go away: even
> if the device node refcount hits zero and the device node is freed, the
> device is still present and in use in the system.  So, having the
> device node refcount hit zero is actually a bug: what that's saying
> is that the system device (e.g. the SCU) has gone away.  If you somehow
> were to remove the SCU from the system, you'd end up severing the
> connection between the CPU cores and the rest of the system, obviously
> resulting in a dead system!
> 
> So, what is the point of dropping these refcounts for devices that can
> never go away - and thus their associated device_node should also never
> go away?
> 

Yes, in practice we would never hit this case, but in principle we should
decrement the refcount for nodes/properties whenever we are done with them.
As you know, there are many places in the kernel where the refcount is not
put after use. So I would welcome this kind of patch, to set an example for
anyone who uses the of_* calls in future.

IMO, DT should've handled the refcount internally without exposing the
pointers to the external world.

Thanks,
Mani

Russell King (Oracle) March 9, 2019, 8:14 a.m. UTC | #3
On Sat, Mar 09, 2019 at 07:47:42AM +0530, Manivannan Sadhasivam wrote:
> Hi Russell,
> 
> Yes, in practice we would never hit this case, but in principle we should
> decrement the refcount for nodes/properties whenever we are done with them.
> As you know, there are many places in the kernel where the refcount is not
> put after use. So I would welcome this kind of patch, to set an example for
> anyone who uses the of_* calls in future.
> 
> IMO, DT should've handled the refcount internally without exposing the
> pointers to the external world.

It doesn't, that's my point.

In the case of normal drivers, there's an _extra_ refcount held by the
device that is created - see the of_node_get() in of_device_alloc().
That refcount exists for the lifetime of the device structure, which
bounds the period during which the device is available to the driver.

In effect, while the device driver is bound, there is a refcount on
the device node.  So, the device node is guaranteed to be around for
as long as the device driver is bound to the device.

For the cases being addressed in these patches, there is no driver, so
there is no bounding of the lifetime: the expectation is that the
lifetime is the duration of the kernel.  If such a device node were to
be deleted, then there is no way to unbind the driver, and if we have
dropped the refcount, the device node will be immediately freed.
However, the device is still in use.

These are a different "class" of driver.
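
The lifetime argument above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not kernel code:
struct model_node, model_get() and model_put() are hypothetical stand-ins
for the device node and its get/put calls, and "freeing" is modelled by
setting a flag so that the result can be inspected.

```c
/* User-space model only, NOT kernel code; every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct model_node {
	int refcount;	/* models the device node reference count */
	bool freed;	/* set when the count hits zero (stands in for the free) */
};

static void model_node_init(struct model_node *n)
{
	n->refcount = 1;	/* reference held by the (model) device tree */
	n->freed = false;
}

static struct model_node *model_get(struct model_node *n)
{
	n->refcount++;
	return n;
}

static void model_put(struct model_node *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0)
		n->freed = true;
}

/* Case 1: a bound device holds its own reference for its whole lifetime. */
static bool bound_device_keeps_node_alive(void)
{
	struct model_node n;
	bool alive_while_bound;

	model_node_init(&n);
	struct model_node *lookup = model_get(&n);	/* reference from the lookup */
	struct model_node *dev_ref = model_get(&n);	/* extra reference held by the device */

	model_put(lookup);	/* the lookup reference is dropped after use */
	model_put(&n);		/* the node is removed from the tree */
	alive_while_bound = !n.freed;	/* the device reference still pins it */

	model_put(dev_ref);	/* device released: only now is the node freed */
	return alive_while_bound && n.freed;
}

/* Case 2: no device structure exists, so nothing else pins the node. */
static bool unbound_user_loses_node(void)
{
	struct model_node n;

	model_node_init(&n);
	struct model_node *lookup = model_get(&n);	/* reference from the lookup */

	model_put(lookup);	/* dropped right after mapping the registers */
	model_put(&n);		/* the node is removed from the tree */
	return n.freed;		/* freed, although the hardware is still in use */
}
```

In this model both functions return true: in the first case the node
survives removal from the tree until the device drops its reference, and
in the second the node is freed as soon as it is removed, with nothing to
unbind.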

Patch

diff --git a/arch/arm/mach-actions/platsmp.c b/arch/arm/mach-actions/platsmp.c
index 4fd479c..1a8e078 100644
--- a/arch/arm/mach-actions/platsmp.c
+++ b/arch/arm/mach-actions/platsmp.c
@@ -107,6 +107,7 @@  static void __init s500_smp_prepare_cpus(unsigned int max_cpus)
 	}
 
 	timer_base_addr = of_iomap(node, 0);
+	of_node_put(node);
 	if (!timer_base_addr) {
 		pr_err("%s: could not map timer registers\n", __func__);
 		return;
@@ -119,6 +120,7 @@  static void __init s500_smp_prepare_cpus(unsigned int max_cpus)
 	}
 
 	sps_base_addr = of_iomap(node, 0);
+	of_node_put(node);
 	if (!sps_base_addr) {
 		pr_err("%s: could not map sps registers\n", __func__);
 		return;
@@ -132,6 +134,7 @@  static void __init s500_smp_prepare_cpus(unsigned int max_cpus)
 		}
 
 		scu_base_addr = of_iomap(node, 0);
+		of_node_put(node);
 		if (!scu_base_addr) {
 			pr_err("%s: could not map scu registers\n", __func__);
 			return;
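
Each hunk above places of_node_put() directly after of_iomap() and before
the NULL check, so the reference is released on both the success and the
error path. A minimal user-space sketch of that ordering follows. It is
not kernel code: struct fake_node, fake_find(), fake_iomap(), fake_put()
and fake_prepare() are hypothetical stand-ins introduced only for this
illustration.

```c
/* User-space model only, NOT kernel code; every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>

struct fake_node {
	int refcount;	/* models the device node reference count */
};

/* Models a lookup that returns the node with its refcount incremented. */
static struct fake_node *fake_find(struct fake_node *n)
{
	n->refcount++;
	return n;
}

/* Models releasing the reference taken by the lookup. */
static void fake_put(struct fake_node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

/* Models a register mapping that may fail; the result does not hold the node. */
static void *fake_iomap(struct fake_node *n, int succeed)
{
	static int fake_regs;

	(void)n;
	return succeed ? (void *)&fake_regs : NULL;
}

/* Same shape as one hunk: look up, map, put, and only then check the result. */
static int fake_prepare(struct fake_node *tree_node, int map_succeeds, void **base)
{
	struct fake_node *node = fake_find(tree_node);

	*base = fake_iomap(node, map_succeeds);
	fake_put(node);		/* last use of node: release before the error check */
	if (!*base)
		return -1;	/* error path, reference already released */
	return 0;
}
```

Because the put comes before the check, the model's refcount returns to
its starting value whether or not the mapping succeeds; putting it after
the check would require a second put on the error path.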