diff mbox series

net: ethernet: nxp: Fix a possible memory leak in lpc_mii_probe()

Message ID 20240909092948.1118381-1-ruanjinjie@huawei.com (mailing list archive)
State New, archived
Headers show
Series net: ethernet: nxp: Fix a possible memory leak in lpc_mii_probe() | expand

Commit Message

Jinjie Ruan Sept. 9, 2024, 9:29 a.m. UTC
of_phy_find_device() calls bus_find_device(), which calls get_device()
on the returned struct device * to increment the refcount. The current
implementation does not decrement the refcount, which causes memory leak.

So add the missing phy_device_free() call to decrement the
refcount via put_device() to balance the refcount.

Fixes: 3503bf024b3e ("net: lpc_eth: parse phy nodes from device tree")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 drivers/net/ethernet/nxp/lpc_eth.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Andrew Lunn Sept. 9, 2024, 12:54 p.m. UTC | #1
On Mon, Sep 09, 2024 at 05:29:48PM +0800, Jinjie Ruan wrote:
> of_phy_find_device() calls bus_find_device(), which calls get_device()
> on the returned struct device * to increment the refcount. The current
> implementation does not decrement the refcount, which causes memory leak.
> 
> So add the missing phy_device_free() call to decrement the
> refcount via put_device() to balance the refcount.

Why is a device reference counted?

To stop is disappearing.

> @@ -768,6 +768,9 @@ static int lpc_mii_probe(struct net_device *ndev)
>  		return -ENODEV;
>  	}
>  
> +	if (pldat->phy_node)
> +		phy_device_free(phydev);
> +
>  	phydev = phy_connect(ndev, phydev_name(phydev),
>  			     &lpc_handle_link_change,
>  			     lpc_phy_interface_mode(&pldat->pdev->dev));

Think about this code. We use of_phy_find_device to get the device,
taking a reference on it. While we hold that reference, we know it
cannot disappear and we passed it to phy_connect(), passing it into
the phylib layer. Deep down, phy_attach_direct() is called which does
a get_device() taking a reference on the device. That is the phylib
layer saying it is using it, it does not want it to disappear.

Now think about your change. As soon as you new phy_device_free() is
called, the device can disappear. phylib is then going to use
something which has gone. Bad things will happen.

So with changes like this, you need to think about lifetimes of things
being protected by a reference count. When has lpc_mii_probe(), or the
lpc driver as a whole finished with phydev? There are two obvious
alternatives i can think of.

1) It wants to keep hold of the reference until the driver remove() is
called, so you should be releasing the reference in
lpc_eth_drv_remove().

2) Once the phydev is passed to the phylib layer for it to manage,
this driver does not need to care about it any more. So it just needs
to hold the reference until after phy_connect() returns.

Memory leaks are an annoyance, but generally have little effect,
especially in probe/remove code which gets called once. Accessing
something which has gone is going to cause an Opps.

So, you need to think about the lifetime of objects you are
manipulating the reference counts on. You want to state in the commit
message your understanding of these lifetimes so the reviewer can
sanity check them.

FYI: Ignore anything you have learned while fixing device tree
reference counting bugs. Lifetimes of OF objects is probably very
broken.

	Andrew

---
pw-bot: cr
Jinjie Ruan Sept. 9, 2024, 1:17 p.m. UTC | #2
On 2024/9/9 20:54, Andrew Lunn wrote:
> On Mon, Sep 09, 2024 at 05:29:48PM +0800, Jinjie Ruan wrote:
>> of_phy_find_device() calls bus_find_device(), which calls get_device()
>> on the returned struct device * to increment the refcount. The current
>> implementation does not decrement the refcount, which causes memory leak.
>>
>> So add the missing phy_device_free() call to decrement the
>> refcount via put_device() to balance the refcount.
> 
> Why is a device reference counted?
> 
> To stop is disappearing.
> 
>> @@ -768,6 +768,9 @@ static int lpc_mii_probe(struct net_device *ndev)
>>  		return -ENODEV;
>>  	}
>>  
>> +	if (pldat->phy_node)
>> +		phy_device_free(phydev);
>> +
>>  	phydev = phy_connect(ndev, phydev_name(phydev),
>>  			     &lpc_handle_link_change,
>>  			     lpc_phy_interface_mode(&pldat->pdev->dev));
> 
> Think about this code. We use of_phy_find_device to get the device,
> taking a reference on it. While we hold that reference, we know it
> cannot disappear and we passed it to phy_connect(), passing it into
> the phylib layer. Deep down, phy_attach_direct() is called which does
> a get_device() taking a reference on the device. That is the phylib
> layer saying it is using it, it does not want it to disappear.
> 
> Now think about your change. As soon as you new phy_device_free() is
> called, the device can disappear. phylib is then going to use
> something which has gone. Bad things will happen.

Hi, Andrew,
Thank you to share me your a wealth of relevant knowledge. My knowledge
of reference count is relatively shallow.

> 
> So with changes like this, you need to think about lifetimes of things
> being protected by a reference count. When has lpc_mii_probe(), or the
> lpc driver as a whole finished with phydev? There are two obvious
> alternatives i can think of.
> 
> 1) It wants to keep hold of the reference until the driver remove() is
> called, so you should be releasing the reference in
> lpc_eth_drv_remove().
> 
> 2) Once the phydev is passed to the phylib layer for it to manage,
> this driver does not need to care about it any more. So it just needs
> to hold the reference until after phy_connect() returns.

I think this is a good chance to put the device after phy_connect().

> 
> Memory leaks are an annoyance, but generally have little effect,
> especially in probe/remove code which gets called once. Accessing
> something which has gone is going to cause an Opps.
> 
> So, you need to think about the lifetime of objects you are
> manipulating the reference counts on. You want to state in the commit
> message your understanding of these lifetimes so the reviewer can
> sanity check them.>
> FYI: Ignore anything you have learned while fixing device tree
> reference counting bugs. Lifetimes of OF objects is probably very
> broken.
> 
> 	Andrew
> 
> ---
> pw-bot: cr
diff mbox series

Patch

diff --git a/drivers/net/ethernet/nxp/lpc_eth.c b/drivers/net/ethernet/nxp/lpc_eth.c
index dd3e58a1319c..8bff394991e4 100644
--- a/drivers/net/ethernet/nxp/lpc_eth.c
+++ b/drivers/net/ethernet/nxp/lpc_eth.c
@@ -768,6 +768,9 @@  static int lpc_mii_probe(struct net_device *ndev)
 		return -ENODEV;
 	}
 
+	if (pldat->phy_node)
+		phy_device_free(phydev);
+
 	phydev = phy_connect(ndev, phydev_name(phydev),
 			     &lpc_handle_link_change,
 			     lpc_phy_interface_mode(&pldat->pdev->dev));