Message ID | 20240909092948.1118381-1-ruanjinjie@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | net: ethernet: nxp: Fix a possible memory leak in lpc_mii_probe() | expand |
On Mon, Sep 09, 2024 at 05:29:48PM +0800, Jinjie Ruan wrote: > of_phy_find_device() calls bus_find_device(), which calls get_device() > on the returned struct device * to increment the refcount. The current > implementation does not decrement the refcount, which causes memory leak. > > So add the missing phy_device_free() call to decrement the > refcount via put_device() to balance the refcount. Why is a device reference counted? To stop is disappearing. > @@ -768,6 +768,9 @@ static int lpc_mii_probe(struct net_device *ndev) > return -ENODEV; > } > > + if (pldat->phy_node) > + phy_device_free(phydev); > + > phydev = phy_connect(ndev, phydev_name(phydev), > &lpc_handle_link_change, > lpc_phy_interface_mode(&pldat->pdev->dev)); Think about this code. We use of_phy_find_device to get the device, taking a reference on it. While we hold that reference, we know it cannot disappear and we passed it to phy_connect(), passing it into the phylib layer. Deep down, phy_attach_direct() is called which does a get_device() taking a reference on the device. That is the phylib layer saying it is using it, it does not want it to disappear. Now think about your change. As soon as you new phy_device_free() is called, the device can disappear. phylib is then going to use something which has gone. Bad things will happen. So with changes like this, you need to think about lifetimes of things being protected by a reference count. When has lpc_mii_probe(), or the lpc driver as a whole finished with phydev? There are two obvious alternatives i can think of. 1) It wants to keep hold of the reference until the driver remove() is called, so you should be releasing the reference in lpc_eth_drv_remove(). 2) Once the phydev is passed to the phylib layer for it to manage, this driver does not need to care about it any more. So it just needs to hold the reference until after phy_connect() returns. Memory leaks are an annoyance, but generally have little effect, especially in probe/remove code which gets called once. Accessing something which has gone is going to cause an Opps. So, you need to think about the lifetime of objects you are manipulating the reference counts on. You want to state in the commit message your understanding of these lifetimes so the reviewer can sanity check them. FYI: Ignore anything you have learned while fixing device tree reference counting bugs. Lifetimes of OF objects is probably very broken. Andrew --- pw-bot: cr
On 2024/9/9 20:54, Andrew Lunn wrote: > On Mon, Sep 09, 2024 at 05:29:48PM +0800, Jinjie Ruan wrote: >> of_phy_find_device() calls bus_find_device(), which calls get_device() >> on the returned struct device * to increment the refcount. The current >> implementation does not decrement the refcount, which causes memory leak. >> >> So add the missing phy_device_free() call to decrement the >> refcount via put_device() to balance the refcount. > > Why is a device reference counted? > > To stop is disappearing. > >> @@ -768,6 +768,9 @@ static int lpc_mii_probe(struct net_device *ndev) >> return -ENODEV; >> } >> >> + if (pldat->phy_node) >> + phy_device_free(phydev); >> + >> phydev = phy_connect(ndev, phydev_name(phydev), >> &lpc_handle_link_change, >> lpc_phy_interface_mode(&pldat->pdev->dev)); > > Think about this code. We use of_phy_find_device to get the device, > taking a reference on it. While we hold that reference, we know it > cannot disappear and we passed it to phy_connect(), passing it into > the phylib layer. Deep down, phy_attach_direct() is called which does > a get_device() taking a reference on the device. That is the phylib > layer saying it is using it, it does not want it to disappear. > > Now think about your change. As soon as you new phy_device_free() is > called, the device can disappear. phylib is then going to use > something which has gone. Bad things will happen. Hi, Andrew, Thank you to share me your a wealth of relevant knowledge. My knowledge of reference count is relatively shallow. > > So with changes like this, you need to think about lifetimes of things > being protected by a reference count. When has lpc_mii_probe(), or the > lpc driver as a whole finished with phydev? There are two obvious > alternatives i can think of. > > 1) It wants to keep hold of the reference until the driver remove() is > called, so you should be releasing the reference in > lpc_eth_drv_remove(). > > 2) Once the phydev is passed to the phylib layer for it to manage, > this driver does not need to care about it any more. So it just needs > to hold the reference until after phy_connect() returns. I think this is a good chance to put the device after phy_connect(). > > Memory leaks are an annoyance, but generally have little effect, > especially in probe/remove code which gets called once. Accessing > something which has gone is going to cause an Opps. > > So, you need to think about the lifetime of objects you are > manipulating the reference counts on. You want to state in the commit > message your understanding of these lifetimes so the reviewer can > sanity check them.> > FYI: Ignore anything you have learned while fixing device tree > reference counting bugs. Lifetimes of OF objects is probably very > broken. > > Andrew > > --- > pw-bot: cr
diff --git a/drivers/net/ethernet/nxp/lpc_eth.c b/drivers/net/ethernet/nxp/lpc_eth.c index dd3e58a1319c..8bff394991e4 100644 --- a/drivers/net/ethernet/nxp/lpc_eth.c +++ b/drivers/net/ethernet/nxp/lpc_eth.c @@ -768,6 +768,9 @@ static int lpc_mii_probe(struct net_device *ndev) return -ENODEV; } + if (pldat->phy_node) + phy_device_free(phydev); + phydev = phy_connect(ndev, phydev_name(phydev), &lpc_handle_link_change, lpc_phy_interface_mode(&pldat->pdev->dev));
of_phy_find_device() calls bus_find_device(), which calls get_device() on the returned struct device * to increment the refcount. The current implementation does not decrement the refcount, which causes memory leak. So add the missing phy_device_free() call to decrement the refcount via put_device() to balance the refcount. Fixes: 3503bf024b3e ("net: lpc_eth: parse phy nodes from device tree") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> --- drivers/net/ethernet/nxp/lpc_eth.c | 3 +++ 1 file changed, 3 insertions(+)