diff mbox series

net: phy: don't issue a module request if a driver is available

Message ID 20250101235122.704012-1-francesco@valla.it (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series net: phy: don't issue a module request if a driver is available | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 4 maintainers not CCed: edumazet@google.com andrew@lunn.ch pabeni@redhat.com kuba@kernel.org
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/checkpatch warning WARNING: line length of 88 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 39 this patch: 39
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-01-02--18-00 (tests: 881)

Commit Message

Francesco Valla Jan. 1, 2025, 11:51 p.m. UTC
Whenever a new PHY device is created, request_module() is called
unconditionally, without checking if a driver for the new PHY is already
available (either built-in or from a previous probe). This conflicts
with async probing of the underlying MDIO bus and always throws a
warning (because if a driver is loaded it _might_ cause a deadlock, if
in turn it calls async_synchronize_full()).

Add a list of registered drivers and check if one is already available
before resorting to call request_module(); in this way, if the PHY
driver is already there, the MDIO bus can perform the async probe.

Signed-off-by: Francesco Valla <francesco@valla.it>
---
 drivers/net/phy/phy_device.c | 61 +++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 8 deletions(-)

Comments

Suman Ghosh Jan. 2, 2025, 9:35 a.m. UTC | #1
> 	mutex_init(&dev->lock);
> 	INIT_DELAYED_WORK(&dev->state_queue, phy_state_machine);
>
>-	/* Request the appropriate module unconditionally; don't
>-	 * bother trying to do so only if it isn't already loaded,
>-	 * because that gets complicated. A hotplug event would have
>-	 * done an unconditional modprobe anyway.
>-	 * We don't do normal hotplug because it won't work for MDIO
>+	/* We don't do normal hotplug because it won't work for MDIO
> 	 * -- because it relies on the device staying around for long
> 	 * enough for the driver to get loaded. With MDIO, the NIC
> 	 * driver will get bored and give up as soon as it finds that @@ -
>724,7 +745,8 @@ struct phy_device *phy_device_create(struct mii_bus
>*bus, int addr, u32 phy_id,
> 		int i;
>
> 		for (i = 1; i < num_ids; i++) {
>-			if (c45_ids->device_ids[i] == 0xffffffff)
>+			if (c45_ids->device_ids[i] == 0xffffffff ||
>+			    phy_driver_exists(c45_ids->device_ids[i]))
> 				continue;
>
> 			ret = phy_request_driver_module(dev, @@ -732,7 +754,7 @@
>struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32
>phy_id,
> 			if (ret)
> 				break;
> 		}
>-	} else {
>+	} else if (!phy_driver_exists(phy_id)) {
[Suman] Can we add this phy_driver_exists() API call before the if/else check?

> 		ret = phy_request_driver_module(dev, phy_id);
> 	}
>
Francesco Valla Jan. 2, 2025, 10:21 a.m. UTC | #2
Hi Suman,

On Thursday, 2 January 2025 at 10:35:40 Suman Ghosh <sumang@marvell.com> wrote:
> > 	mutex_init(&dev->lock);
> > 	INIT_DELAYED_WORK(&dev->state_queue, phy_state_machine);
> >
> >-	/* Request the appropriate module unconditionally; don't
> >-	 * bother trying to do so only if it isn't already loaded,
> >-	 * because that gets complicated. A hotplug event would have
> >-	 * done an unconditional modprobe anyway.
> >-	 * We don't do normal hotplug because it won't work for MDIO
> >+	/* We don't do normal hotplug because it won't work for MDIO
> > 	 * -- because it relies on the device staying around for long
> > 	 * enough for the driver to get loaded. With MDIO, the NIC
> > 	 * driver will get bored and give up as soon as it finds that @@ -
> >724,7 +745,8 @@ struct phy_device *phy_device_create(struct mii_bus
> >*bus, int addr, u32 phy_id,
> > 		int i;
> >
> > 		for (i = 1; i < num_ids; i++) {
> >-			if (c45_ids->device_ids[i] == 0xffffffff)
> >+			if (c45_ids->device_ids[i] == 0xffffffff ||
> >+			    phy_driver_exists(c45_ids->device_ids[i]))
> > 				continue;
> >
> > 			ret = phy_request_driver_module(dev, @@ -732,7 +754,7 @@
> >struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32
> >phy_id,
> > 			if (ret)
> > 				break;
> > 		}
> >-	} else {
> >+	} else if (!phy_driver_exists(phy_id)) {
> [Suman] Can we add this phy_driver_exists() API call before the if/else check?
> 

Not really, as in case of C45 PHYs we have to check for drivers using (multiple)
IDs, which are different from phy_id.

> > 		ret = phy_request_driver_module(dev, phy_id);
> > 	}
> >
> 


Thank you!

Regards,
Francesco
Russell King (Oracle) Jan. 2, 2025, 11:06 a.m. UTC | #3
On Thu, Jan 02, 2025 at 12:51:22AM +0100, Francesco Valla wrote:
> Whenever a new PHY device is created, request_module() is called
> unconditionally, without checking if a driver for the new PHY is already
> available (either built-in or from a previous probe). This conflicts
> with async probing of the underlying MDIO bus and always throws a
> warning (because if a driver is loaded it _might_ cause a deadlock, if
> in turn it calls async_synchronize_full()).

Why aren't any of the phylib maintainers seeing this warning? Where does
the warning come from?

> +static bool phy_driver_exists(u32 phy_id)
> +{
> +	bool found = false;
> +	struct phy_drv_node *node;
> +
> +	down_read(&phy_drv_list_sem);
> +	list_for_each_entry(node, &phy_drv_list, list) {
> +		if (phy_id_compare(phy_id, node->drv->phy_id, node->drv->phy_id_mask)) {
> +			found = true;
> +			break;
> +		}
> +	}
> +	up_read(&phy_drv_list_sem);
> +
> +	return found;
> +}
> +

Why do we need this, along with the associated additional memory
allocations? What's wrong with bus_for_each_drv() which the core
code provides?
Francesco Valla Jan. 2, 2025, 1:26 p.m. UTC | #4
On Thursday, 2 January 2025 at 12:06:15 Russell King (Oracle) <linux@armlinux.org.uk> wrote:
> On Thu, Jan 02, 2025 at 12:51:22AM +0100, Francesco Valla wrote:
> > Whenever a new PHY device is created, request_module() is called
> > unconditionally, without checking if a driver for the new PHY is already
> > available (either built-in or from a previous probe). This conflicts
> > with async probing of the underlying MDIO bus and always throws a
> > warning (because if a driver is loaded it _might_ cause a deadlock, if
> > in turn it calls async_synchronize_full()).
> 
> Why aren't any of the phylib maintainers seeing this warning? Where does
> the warning come from?
> 

I'm not sure. For me, it was pretty easy to trigger. I'm on a BeaglePlay (AM62X)
and I am working on boot time optimization and with different configurations for
async probing. When the async probe of the Ethernet NIC (i.e.: am65-cpsw-nuss)
runs, it in turn triggers the probe of the associated davincio_mdio device,
which then brings me to the following WARNING:

[    0.287271] davinci_mdio 8000f00.mdio: davinci mdio revision 9.7, bus freq 1000000
[    0.287574] ------------[ cut here ]------------
[    0.287581] WARNING: CPU: 2 PID: 48 at /kernel/module/kmod.c:144 __request_module+0x19c/0x204
[    0.287605] Modules linked in:
[    0.287619] CPU: 2 UID: 0 PID: 48 Comm: kworker/u16:2 Not tainted 6.12.4-00004-g89f77a9313d4-dirty #1
[    0.287630] Hardware name: BeagleBoard.org BeaglePlay (DT)
[    0.287637] Workqueue: async async_run_entry_fn
[    0.287653] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.287663] pc : __request_module+0x19c/0x204
[    0.287671] lr : __request_module+0x198/0x204
[    0.287679] sp : ffff8000817b2e70
[    0.287684] x29: ffff8000817b2ef0 x28: ffff000001b994b0 x27: 0000000000000000
[    0.287698] x26: 0000000000000000 x25: ffff8000817b3277 x24: 0000000000000000
[    0.287712] x23: ffff000001b99000 x22: 0000000000000000 x21: ffff0000038a8800
[    0.287726] x20: 0000000000000001 x19: ffff800080d4dc18 x18: 0000000000000002
[    0.287739] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[    0.287753] x14: 0000000000000001 x13: 0000000000000001 x12: 0000000000000001
[    0.287767] x11: 0000000000000001 x10: 0000000000000000 x9 : 0000000000000000
[    0.287780] x8 : ffff8000817b2ee8 x7 : 0000000000000000 x6 : 0000000000000000
[    0.287794] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000030
[    0.287806] x2 : 0000000000000008 x1 : ffff8000800c66dc x0 : 0000000000000001
[    0.287820] Call trace:
[    0.287821] omap_i2c 20000000.i2c: bus 0 rev0.12 at 400 kHz
[    0.287826]  __request_module+0x19c/0x204
[    0.287835]  phy_request_driver_module+0x11c/0x17c
[    0.287849]  phy_device_create+0x234/0x260
[    0.287860]  get_phy_device+0x78/0x154
[    0.287870]  fwnode_mdiobus_register_phy+0x11c/0x1d8
[    0.287880]  __of_mdiobus_parse_phys+0x174/0x2a0
[    0.287890]  __of_mdiobus_register+0x104/0x240
[    0.287899]  davinci_mdio_probe+0x284/0x44c
[    0.287909]  platform_probe+0x68/0xc4
[    0.287921]  really_probe+0xbc/0x29c
[    0.287930]  really_probe_debug+0x88/0x110
[    0.287940]  __driver_probe_device+0xbc/0xd4
[    0.287948]  driver_probe_device+0xd8/0x15c
[    0.287957]  __device_attach_driver+0xb8/0x134
[    0.287965]  bus_for_each_drv+0x88/0xe8
[    0.287974]  __device_attach+0xa0/0x190
[    0.287982]  device_initial_probe+0x14/0x20
[    0.287991]  bus_probe_device+0xac/0xb0
[    0.287998]  device_add+0x588/0x74c
[    0.288011]  of_device_add+0x44/0x60
[    0.288027]  of_platform_device_create_pdata+0x8c/0x120
[    0.288039]  of_platform_device_create+0x18/0x24
[    0.288050]  am65_cpsw_nuss_probe+0x670/0xcf4
[    0.288062]  platform_probe+0x68/0xc4
[    0.288072]  really_probe+0xbc/0x29c
[    0.288079]  really_probe_debug+0x88/0x110
[    0.288088]  __driver_probe_device+0xbc/0xd4
[    0.288096]  driver_probe_device+0xd8/0x15c
[    0.288105]  __driver_attach_async_helper+0x4c/0xb4
[    0.288113]  async_run_entry_fn+0x34/0xe0
[    0.288124]  process_one_work+0x148/0x288
[    0.288137]  worker_thread+0x2cc/0x3d4
[    0.288147]  kthread+0x110/0x114
[    0.288159]  ret_from_fork+0x10/0x20
[    0.288171] ---[ end trace 0000000000000000 ]---

This is expected, as request_module() is not meant to be called from an async
context:

https://lore.kernel.org/lkml/20130118221227.GG24579@htj.dyndns.org/

It should be noted that:
 - the davincio_mdio device is a child of the am65-cpsw-nuss device
 - the am65-cpsw-nuss driver is NOT marked with neither PROBE_PREFER_ASYNCHRONOUS
   nor PROBE_FORCE_SYNCHRONOUS and the behavior is being triggered specifying
   driver_async_probe=am65-cpsw-nuss on the command line.

> > +static bool phy_driver_exists(u32 phy_id)
> > +{
> > +	bool found = false;
> > +	struct phy_drv_node *node;
> > +
> > +	down_read(&phy_drv_list_sem);
> > +	list_for_each_entry(node, &phy_drv_list, list) {
> > +		if (phy_id_compare(phy_id, node->drv->phy_id, node->drv->phy_id_mask)) {
> > +			found = true;
> > +			break;
> > +		}
> > +	}
> > +	up_read(&phy_drv_list_sem);
> > +
> > +	return found;
> > +}
> > +
> 
> Why do we need this, along with the associated additional memory
> allocations? What's wrong with bus_for_each_drv() which the core
> code provides?
> 
> 

Because I didn't think of it, but it would be much better. I'll refactor the
logic using bus_for_each_drv() + a simple match callback.

Thank you!

Regards,
Francesco
Andrew Lunn Jan. 2, 2025, 1:52 p.m. UTC | #5
On Thu, Jan 02, 2025 at 02:26:58PM +0100, Francesco Valla wrote:
> On Thursday, 2 January 2025 at 12:06:15 Russell King (Oracle) <linux@armlinux.org.uk> wrote:
> > On Thu, Jan 02, 2025 at 12:51:22AM +0100, Francesco Valla wrote:
> > > Whenever a new PHY device is created, request_module() is called
> > > unconditionally, without checking if a driver for the new PHY is already
> > > available (either built-in or from a previous probe). This conflicts
> > > with async probing of the underlying MDIO bus and always throws a
> > > warning (because if a driver is loaded it _might_ cause a deadlock, if
> > > in turn it calls async_synchronize_full()).
> > 
> > Why aren't any of the phylib maintainers seeing this warning? Where does
> > the warning come from?
> > 
> 
> I'm not sure. For me, it was pretty easy to trigger.

Please include the information how you triggered it into the commit
message.

> This is expected, as request_module() is not meant to be called from an async
> context:
> 
> https://lore.kernel.org/lkml/20130118221227.GG24579@htj.dyndns.org/
> 
> It should be noted that:
>  - the davincio_mdio device is a child of the am65-cpsw-nuss device
>  - the am65-cpsw-nuss driver is NOT marked with neither PROBE_PREFER_ASYNCHRONOUS
>    nor PROBE_FORCE_SYNCHRONOUS and the behavior is being triggered specifying
>    driver_async_probe=am65-cpsw-nuss on the command line.

So the phylib core is currently async probe incompatible. The whole
module loading story is a bit shaky in phylib, so we need to be very
careful with any changes, or you are going to break stuff, in
interesting ways, with it first appearing to work, because the
fallback genphy is used rather than the specific PHY driver, but then
breaking when genphy is not sufficient.

Please think about this as a generic problem with async probe. Is this
really specific to phylib? Should some or all of the solution to the
problem be moved into the driver core? Could we maybe first try an
async probe using the existing drivers, and then fall back to a sync
probe which can load additional drivers?

One other question, how much speadup do you get with async probe of
PHYs? Is it really worth the effort?

	Andrew
Francesco Valla Jan. 2, 2025, 2:01 p.m. UTC | #6
On Thursday, 2 January 2025 at 14:52:18 Andrew Lunn <andrew@lunn.ch> wrote:
> On Thu, Jan 02, 2025 at 02:26:58PM +0100, Francesco Valla wrote:
> > On Thursday, 2 January 2025 at 12:06:15 Russell King (Oracle) <linux@armlinux.org.uk> wrote:
> > > On Thu, Jan 02, 2025 at 12:51:22AM +0100, Francesco Valla wrote:
> > > > Whenever a new PHY device is created, request_module() is called
> > > > unconditionally, without checking if a driver for the new PHY is already
> > > > available (either built-in or from a previous probe). This conflicts
> > > > with async probing of the underlying MDIO bus and always throws a
> > > > warning (because if a driver is loaded it _might_ cause a deadlock, if
> > > > in turn it calls async_synchronize_full()).
> > > 
> > > Why aren't any of the phylib maintainers seeing this warning? Where does
> > > the warning come from?
> > > 
> > 
> > I'm not sure. For me, it was pretty easy to trigger.
> 
> Please include the information how you triggered it into the commit
> message.
> 

Ok, will do.

> > This is expected, as request_module() is not meant to be called from an async
> > context:
> > 
> > https://lore.kernel.org/lkml/20130118221227.GG24579@htj.dyndns.org/
> > 
> > It should be noted that:
> >  - the davincio_mdio device is a child of the am65-cpsw-nuss device
> >  - the am65-cpsw-nuss driver is NOT marked with neither PROBE_PREFER_ASYNCHRONOUS
> >    nor PROBE_FORCE_SYNCHRONOUS and the behavior is being triggered specifying
> >    driver_async_probe=am65-cpsw-nuss on the command line.
> 
> So the phylib core is currently async probe incompatible. The whole
> module loading story is a bit shaky in phylib, so we need to be very
> careful with any changes, or you are going to break stuff, in
> interesting ways, with it first appearing to work, because the
> fallback genphy is used rather than the specific PHY driver, but then
> breaking when genphy is not sufficient.
> 
> Please think about this as a generic problem with async probe. Is this
> really specific to phylib? Should some or all of the solution to the
> problem be moved into the driver core? Could we maybe first try an
> async probe using the existing drivers, and then fall back to a sync
> probe which can load additional drivers?

It probably isn't, but considering the way phylib works currently (with the
genphy filling up for missing drivers etc.) I'm not sure if/how this can work.
I need to think about it.

> 
> One other question, how much speadup do you get with async probe of
> PHYs? Is it really worth the effort?

For me it's a reduction of ~170ms, which currently accounts for roughly the 25%
of the time spent before starting userspace (660ms vs 490ms, give or take a
couple of milliseconds). That's due to the large reset time required by the PHYs
to initialize, so I expect it would be much lower on most of the systems.
But, I've done much more to shave much less time in the past, so I think it is
at least worth investigating.

Thank you!

Regards,
Francesco

> 
> 	Andrew
>
Andrew Lunn Jan. 2, 2025, 2:21 p.m. UTC | #7
> > One other question, how much speadup do you get with async probe of
> > PHYs? Is it really worth the effort?
> 
> For me it's a reduction of ~170ms, which currently accounts for roughly the 25%
> of the time spent before starting userspace (660ms vs 490ms, give or take a
> couple of milliseconds). That's due to the large reset time required by the PHYs
> to initialize

170ms seems like a long time for a reset. Maybe check the datasheet
and see what it says.

	Andrew
Suman Ghosh Jan. 3, 2025, 7:47 a.m. UTC | #8
>> > 	mutex_init(&dev->lock);
>> > 	INIT_DELAYED_WORK(&dev->state_queue, phy_state_machine);
>> >
>> >-	/* Request the appropriate module unconditionally; don't
>> >-	 * bother trying to do so only if it isn't already loaded,
>> >-	 * because that gets complicated. A hotplug event would have
>> >-	 * done an unconditional modprobe anyway.
>> >-	 * We don't do normal hotplug because it won't work for MDIO
>> >+	/* We don't do normal hotplug because it won't work for MDIO
>> > 	 * -- because it relies on the device staying around for long
>> > 	 * enough for the driver to get loaded. With MDIO, the NIC
>> > 	 * driver will get bored and give up as soon as it finds that @@ -
>> >724,7 +745,8 @@ struct phy_device *phy_device_create(struct mii_bus
>> >*bus, int addr, u32 phy_id,
>> > 		int i;
>> >
>> > 		for (i = 1; i < num_ids; i++) {
>> >-			if (c45_ids->device_ids[i] == 0xffffffff)
>> >+			if (c45_ids->device_ids[i] == 0xffffffff ||
>> >+			    phy_driver_exists(c45_ids->device_ids[i]))
>> > 				continue;
>> >
>> > 			ret = phy_request_driver_module(dev, @@ -732,7 +754,7 @@
>struct
>> >phy_device *phy_device_create(struct mii_bus *bus, int addr, u32
>> >phy_id,
>> > 			if (ret)
>> > 				break;
>> > 		}
>> >-	} else {
>> >+	} else if (!phy_driver_exists(phy_id)) {
>> [Suman] Can we add this phy_driver_exists() API call before the
>if/else check?
>>
>
>Not really, as in case of C45 PHYs we have to check for drivers using
>(multiple) IDs, which are different from phy_id.
[Suman] Hmm got it. We can isolate the else part but that won't make much difference. Thank you!!
>
Niklas Cassel Jan. 3, 2025, 11:25 a.m. UTC | #9
On Thu, Jan 02, 2025 at 02:52:18PM +0100, Andrew Lunn wrote:
> On Thu, Jan 02, 2025 at 02:26:58PM +0100, Francesco Valla wrote:
> > On Thursday, 2 January 2025 at 12:06:15 Russell King (Oracle) <linux@armlinux.org.uk> wrote:
> > > On Thu, Jan 02, 2025 at 12:51:22AM +0100, Francesco Valla wrote:
> > > > Whenever a new PHY device is created, request_module() is called
> > > > unconditionally, without checking if a driver for the new PHY is already
> > > > available (either built-in or from a previous probe). This conflicts
> > > > with async probing of the underlying MDIO bus and always throws a
> > > > warning (because if a driver is loaded it _might_ cause a deadlock, if
> > > > in turn it calls async_synchronize_full()).
> > > 
> > > Why aren't any of the phylib maintainers seeing this warning? Where does
> > > the warning come from?
> > > 
> > 
> > I'm not sure. For me, it was pretty easy to trigger.
> 
> Please include the information how you triggered it into the commit
> message.
> 
> > This is expected, as request_module() is not meant to be called from an async
> > context:
> > 
> > https://lore.kernel.org/lkml/20130118221227.GG24579@htj.dyndns.org/
> > 
> > It should be noted that:
> >  - the davincio_mdio device is a child of the am65-cpsw-nuss device
> >  - the am65-cpsw-nuss driver is NOT marked with neither PROBE_PREFER_ASYNCHRONOUS
> >    nor PROBE_FORCE_SYNCHRONOUS and the behavior is being triggered specifying
> >    driver_async_probe=am65-cpsw-nuss on the command line.
> 
> So the phylib core is currently async probe incompatible. The whole
> module loading story is a bit shaky in phylib, so we need to be very
> careful with any changes, or you are going to break stuff, in
> interesting ways, with it first appearing to work, because the
> fallback genphy is used rather than the specific PHY driver, but then
> breaking when genphy is not sufficient.
> 
> Please think about this as a generic problem with async probe. Is this
> really specific to phylib? Should some or all of the solution to the
> problem be moved into the driver core? Could we maybe first try an
> async probe using the existing drivers, and then fall back to a sync
> probe which can load additional drivers?
> 
> One other question, how much speadup do you get with async probe of
> PHYs? Is it really worth the effort?
> 

I'm trying to enable async probe for my PCIe controller (pcie-dw-rockchip),
which on the radxa rock5b has a RTL8125 NIC connected to it.

By enabling async probe for the PCIe driver I get the same splat as Francesco.

Looking at the prints, it is trying to load a module for PHY ID: 0x1cc840
This PHY ID is defined in: drivers/net/phy/realtek.c.

Looking at my .config I have:
CONFIG_REALTEK_PHY=y

So this is not built as a module, so I am a bit surprised to see this
splat (since the driver is built as built-in).


I think it would be nice if the phylib core could be fixed so that
it does not try to load modules for drivers which are built as built-in.


Also see this old thread that tries to enable async probe by default on
DT systems:
https://lore.kernel.org/linux-kernel//d5796286-ec24-511a-5910-5673f8ea8b10@samsung.com/T/#u

AFAICT, it seems that the phylib core is one of the biggest blockers from
being able to enable async probe by default on DT systems.


Kind regards,
Niklas
Russell King (Oracle) Jan. 3, 2025, 12:08 p.m. UTC | #10
On Fri, Jan 03, 2025 at 12:25:52PM +0100, Niklas Cassel wrote:
> I'm trying to enable async probe for my PCIe controller (pcie-dw-rockchip),
> which on the radxa rock5b has a RTL8125 NIC connected to it.
> 
> By enabling async probe for the PCIe driver I get the same splat as Francesco.
> 
> Looking at the prints, it is trying to load a module for PHY ID: 0x1cc840
> This PHY ID is defined in: drivers/net/phy/realtek.c.
> 
> Looking at my .config I have:
> CONFIG_REALTEK_PHY=y
> 
> So this is not built as a module, so I am a bit surprised to see this
> splat (since the driver is built as built-in).
> 
> 
> I think it would be nice if the phylib core could be fixed so that
> it does not try to load modules for drivers which are built as built-in.
> 
> 
> Also see this old thread that tries to enable async probe by default on
> DT systems:
> https://lore.kernel.org/linux-kernel//d5796286-ec24-511a-5910-5673f8ea8b10@samsung.com/T/#u
> 
> AFAICT, it seems that the phylib core is one of the biggest blockers from
> being able to enable async probe by default on DT systems.

Yes, we accept that phylib is incompatible with async probing. I don't
think that's going to change, because it's fundamentally baked in with
the way the whole fallback driver stuff works.

We *certainly* don't want to move the request_module() into
phy_attach*() (which is the point where we require the driver to be
bound or we fallback to the generic feature-reduced driver). First,
that *will* break SFP modules, no ifs or buts.

Second, moving it there would mean calling request_module() in many
cases with the RTNL held, which blocks things like new connections
network establishing while the module is requested (I've run into this
problem when the TI Wilink driver locks up holding the RTNL lock making
the platform impossible to remotely resolve if there isn't an already
open SSH connection.) We certainly don't want userspace to be doing
stuff while holding the "big" RTNL that affects much of networking.

Third, I suspect phylib already has a race between the PHY driver /
driver core binding the appropriate driver and phy_attach_direct()
attaching the fallback generic driver to the driverless PHY device,
and making this more "async" is going to open that race possibly to
the point where it becomes a problem. (At the moment, it doesn't
seem to cause any issue, so is theoretical right now - but if one
reads the code, it's obvious that there is no locking that prevents
a race there.)

What saves phylib right now is that by issueing the request_module(),
that will wait for the module to be loaded and initialised. The
initialisation function will register the PHY drivers in this module.
As this is synchronous, it will happen before request_module() returns,
and thus before phy_device_create() returns. Thus, if there is a module
available for the PHY, it will be loaded and available to be bound
to the PHY device by the time phy_device_register() is called. This
ensures that - in the case of an auto-loaded module, the race will
never happen.

Yes, it's weak. A scenario that could trigger this is loading PHY
driver modules in parallel with a call to the phy_attach*() functions,
e.g. when bringing up a network interface where the network driver
calls through to phy_attach*() from its .ndo_open() method. If we
simply make phylib's request_module() async, then this race will be
opened for auto-loaded modules as well.

Closing this race to give consistent results is impossible, even if
we add locking. If phy_attach*() were to complete first, the generic
driver would be used despite the PHY specific driver module being
loaded. Alternatively, if the PHY specific driver module finishes
being loaded before phy_attach*() is called, then the PHY specific
driver will be used for the device. So... it needs to be synchronous.

I also don't think "make a list of built-in drivers and omit the
request module" is an acceptable workaround - it's a sticky plaster
for the problem. If the PHY driver isn't built-in, then you have the
same problem with request_module() being issued. You could work around
that by ensuring that the PHY driver is built-in, but then we're
relying on multiple different things all being correct in diverse
areas, which is fragile.

All in all, I don't see at the moment any simple way to make phylib
async-probe compatible.
Niklas Cassel Jan. 3, 2025, 12:49 p.m. UTC | #11
On Fri, Jan 03, 2025 at 12:08:43PM +0000, Russell King (Oracle) wrote:
> On Fri, Jan 03, 2025 at 12:25:52PM +0100, Niklas Cassel wrote:
> > I'm trying to enable async probe for my PCIe controller (pcie-dw-rockchip),
> > which on the radxa rock5b has a RTL8125 NIC connected to it.
> > 
> > By enabling async probe for the PCIe driver I get the same splat as Francesco.
> > 
> > Looking at the prints, it is trying to load a module for PHY ID: 0x1cc840
> > This PHY ID is defined in: drivers/net/phy/realtek.c.
> > 
> > Looking at my .config I have:
> > CONFIG_REALTEK_PHY=y
> > 
> > So this is not built as a module, so I am a bit surprised to see this
> > splat (since the driver is built as built-in).
> > 
> > 
> > I think it would be nice if the phylib core could be fixed so that
> > it does not try to load modules for drivers which are built as built-in.
> > 
> > 
> > Also see this old thread that tries to enable async probe by default on
> > DT systems:
> > https://lore.kernel.org/linux-kernel//d5796286-ec24-511a-5910-5673f8ea8b10@samsung.com/T/#u
> > 
> > AFAICT, it seems that the phylib core is one of the biggest blockers from
> > being able to enable async probe by default on DT systems.
> 
> Yes, we accept that phylib is incompatible with async probing. I don't
> think that's going to change, because it's fundamentally baked in with
> the way the whole fallback driver stuff works.
> 
> We *certainly* don't want to move the request_module() into
> phy_attach*() (which is the point where we require the driver to be
> bound or we fallback to the generic feature-reduced driver). First,
> that *will* break SFP modules, no ifs or buts.
> 
> Second, moving it there would mean calling request_module() in many
> cases with the RTNL held, which blocks things like new connections
> network establishing while the module is requested (I've run into this
> problem when the TI Wilink driver locks up holding the RTNL lock making
> the platform impossible to remotely resolve if there isn't an already
> open SSH connection.) We certainly don't want userspace to be doing
> stuff while holding the "big" RTNL that affects much of networking.
> 
> Third, I suspect phylib already has a race between the PHY driver /
> driver core binding the appropriate driver and phy_attach_direct()
> attaching the fallback generic driver to the driverless PHY device,
> and making this more "async" is going to open that race possibly to
> the point where it becomes a problem. (At the moment, it doesn't
> seem to cause any issue, so is theoretical right now - but if one
> reads the code, it's obvious that there is no locking that prevents
> a race there.)
> 
> What saves phylib right now is that by issueing the request_module(),
> that will wait for the module to be loaded and initialised. The
> initialisation function will register the PHY drivers in this module.
> As this is synchronous, it will happen before request_module() returns,
> and thus before phy_device_create() returns. Thus, if there is a module
> available for the PHY, it will be loaded and available to be bound
> to the PHY device by the time phy_device_register() is called. This
> ensures that - in the case of an auto-loaded module, the race will
> never happen.
> 
> Yes, it's weak. A scenario that could trigger this is loading PHY
> driver modules in parallel with a call to the phy_attach*() functions,
> e.g. when bringing up a network interface where the network driver
> calls through to phy_attach*() from its .ndo_open() method. If we
> simply make phylib's request_module() async, then this race will be
> opened for auto-loaded modules as well.
> 
> Closing this race to give consistent results is impossible, even if
> we add locking. If phy_attach*() were to complete first, the generic
> driver would be used despite the PHY specific driver module being
> loaded. Alternatively, if the PHY specific driver module finishes
> being loaded before phy_attach*() is called, then the PHY specific
> driver will be used for the device. So... it needs to be synchronous.
> 
> I also don't think "make a list of built-in drivers and omit the
> request module" is an acceptable workaround - it's a sticky plaster
> for the problem. If the PHY driver isn't built-in, then you have the
> same problem with request_module() being issued. You could work around
> that by ensuring that the PHY driver is built-in, but then we're
> relying on multiple different things all being correct in diverse
> areas, which is fragile.

FWIW, the patch in $subject does make the splat go away for me.
(I have the PHY driver built as built-in).

The patch in $subject does "Add a list of registered drivers and check
if one is already available before resorting to call request_module();
in this way, if the PHY driver is already there, the MDIO bus can perform
the async probe."

Personally, I think this solution is better than keeping the status-quo.

If we take a buildroot kernel config for a specific board as an example,
they are very careful to always mark the NIC driver for the specific board
as built-in, such that the board can use nfsroot.
(Same logic can be applied to phylib driver.)

Having a solution similar to what is suggested in $subject would certainly
improve things for DT based systems.

Also note that arch/arm64/configs/defconfig marks a bunch of phylib drivers
as built-in.


Kind regards,
Niklas
Russell King (Oracle) Jan. 3, 2025, 2:03 p.m. UTC | #12
On Fri, Jan 03, 2025 at 01:49:30PM +0100, Niklas Cassel wrote:
> On Fri, Jan 03, 2025 at 12:08:43PM +0000, Russell King (Oracle) wrote:
> > On Fri, Jan 03, 2025 at 12:25:52PM +0100, Niklas Cassel wrote:
> > > I'm trying to enable async probe for my PCIe controller (pcie-dw-rockchip),
> > > which on the radxa rock5b has a RTL8125 NIC connected to it.
> > > 
> > > By enabling async probe for the PCIe driver I get the same splat as Francesco.
> > > 
> > > Looking at the prints, it is trying to load a module for PHY ID: 0x1cc840
> > > This PHY ID is defined in: drivers/net/phy/realtek.c.
> > > 
> > > Looking at my .config I have:
> > > CONFIG_REALTEK_PHY=y
> > > 
> > > So this is not built as a module, so I am a bit surprised to see this
> > > splat (since the driver is built as built-in).
> > > 
> > > 
> > > I think it would be nice if the phylib core could be fixed so that
> > > it does not try to load modules for drivers which are built as built-in.
> > > 
> > > 
> > > Also see this old thread that tries to enable async probe by default on
> > > DT systems:
> > > https://lore.kernel.org/linux-kernel//d5796286-ec24-511a-5910-5673f8ea8b10@samsung.com/T/#u
> > > 
> > > AFAICT, it seems that the phylib core is one of the biggest blockers from
> > > being able to enable async probe by default on DT systems.
> > 
> > Yes, we accept that phylib is incompatible with async probing. I don't
> > think that's going to change, because it's fundamentally baked in with
> > the way the whole fallback driver stuff works.
> > 
> > We *certainly* don't want to move the request_module() into
> > phy_attach*() (which is the point where we require the driver to be
> > bound or we fallback to the generic feature-reduced driver). First,
> > that *will* break SFP modules, no ifs or buts.
> > 
> > Second, moving it there would mean calling request_module() in many
> > cases with the RTNL held, which blocks things like new connections
> > network establishing while the module is requested (I've run into this
> > problem when the TI Wilink driver locks up holding the RTNL lock making
> > the platform impossible to remotely resolve if there isn't an already
> > open SSH connection.) We certainly don't want userspace to be doing
> > stuff while holding the "big" RTNL that affects much of networking.
> > 
> > Third, I suspect phylib already has a race between the PHY driver /
> > driver core binding the appropriate driver and phy_attach_direct()
> > attaching the fallback generic driver to the driverless PHY device,
> > and making this more "async" is going to open that race possibly to
> > the point where it becomes a problem. (At the moment, it doesn't
> > seem to cause any issue, so is theoretical right now - but if one
> > reads the code, it's obvious that there is no locking that prevents
> > a race there.)
> > 
> > What saves phylib right now is that by issueing the request_module(),
> > that will wait for the module to be loaded and initialised. The
> > initialisation function will register the PHY drivers in this module.
> > As this is synchronous, it will happen before request_module() returns,
> > and thus before phy_device_create() returns. Thus, if there is a module
> > available for the PHY, it will be loaded and available to be bound
> > to the PHY device by the time phy_device_register() is called. This
> > ensures that - in the case of an auto-loaded module, the race will
> > never happen.
> > 
> > Yes, it's weak. A scenario that could trigger this is loading PHY
> > driver modules in parallel with a call to the phy_attach*() functions,
> > e.g. when bringing up a network interface where the network driver
> > calls through to phy_attach*() from its .ndo_open() method. If we
> > simply make phylib's request_module() async, then this race will be
> > opened for auto-loaded modules as well.
> > 
> > Closing this race to give consistent results is impossible, even if
> > we add locking. If phy_attach*() were to complete first, the generic
> > driver would be used despite the PHY specific driver module being
> > loaded. Alternatively, if the PHY specific driver module finishes
> > being loaded before phy_attach*() is called, then the PHY specific
> > driver will be used for the device. So... it needs to be synchronous.
> > 
> > I also don't think "make a list of built-in drivers and omit the
> > request module" is an acceptable workaround - it's a sticky plaster
> > for the problem. If the PHY driver isn't built-in, then you have the
> > same problem with request_module() being issued. You could work around
> > that by ensuring that the PHY driver is built-in, but then we're
> > relying on multiple different things all being correct in diverse
> > areas, which is fragile.
> 
> FWIW, the patch in $subject does make the splat go away for me.
> (I have the PHY driver built as built-in).

This is not about "making the splat go away". Making something go away
is not solving the problem if it's a fragile sticky plaster. Such things
come back to bite.

> The patch in $subject does "Add a list of registered drivers and check
> if one is already available before resorting to call request_module();
> in this way, if the PHY driver is already there, the MDIO bus can perform
> the async probe."

This is the sticky plaster. I covered that in my message to which you're
replying. See the last paragraph from my reply quoted above.
Andrew Lunn Jan. 3, 2025, 2:12 p.m. UTC | #13
> FWIW, the patch in $subject does make the splat go away for me.
> (I have the PHY driver built as built-in).
> 
> The patch in $subject does "Add a list of registered drivers and check
> if one is already available before resorting to call request_module();
> in this way, if the PHY driver is already there, the MDIO bus can perform
> the async probe."

Lets take a step backwards.

How in general should module loading work with async probe? Lets
understand that first.

Then we can think about what is special about PHYs, and how we can fit
into the existing framework.

	Andrew
Niklas Cassel Jan. 3, 2025, 2:34 p.m. UTC | #14
On Fri, Jan 03, 2025 at 03:12:14PM +0100, Andrew Lunn wrote:
> > FWIW, the patch in $subject does make the splat go away for me.
> > (I have the PHY driver built as built-in).
> > 
> > The patch in $subject does "Add a list of registered drivers and check
> > if one is already available before resorting to call request_module();
> > in this way, if the PHY driver is already there, the MDIO bus can perform
> > the async probe."
> 
> Lets take a step backwards.
> 
> How in general should module loading work with async probe? Lets
> understand that first.
> 
> Then we can think about what is special about PHYs, and how we can fit
> into the existing framework.

I agree that it might be a good idea, if possible, for request_module()
itself to see if the requested module is already registered, and then do
nothing.

Adding Luis (modules maintainer) to CC.

Luis, phylib calls request_module() unconditionally, regardless if there
is a driver registered already (phy driver built as built-in).

This causes the splat described here to be printed:
https://lore.kernel.org/netdev/7103704.9J7NaK4W3v@fedora.fritz.box/T/#u

But as far as I can tell, this is a false positive, since the call cannot
possibly load any module (since the driver is built as built-in and not
as a module).


Kind regards,
Niklas
Andrew Lunn Jan. 3, 2025, 3:37 p.m. UTC | #15
On Fri, Jan 03, 2025 at 03:34:26PM +0100, Niklas Cassel wrote:
> On Fri, Jan 03, 2025 at 03:12:14PM +0100, Andrew Lunn wrote:
> > > FWIW, the patch in $subject does make the splat go away for me.
> > > (I have the PHY driver built as built-in).
> > > 
> > > The patch in $subject does "Add a list of registered drivers and check
> > > if one is already available before resorting to call request_module();
> > > in this way, if the PHY driver is already there, the MDIO bus can perform
> > > the async probe."
> > 
> > Lets take a step backwards.
> > 
> > How in general should module loading work with async probe? Lets
> > understand that first.
> > 
> > Then we can think about what is special about PHYs, and how we can fit
> > into the existing framework.
> 
> I agree that it might be a good idea, if possible, for request_module()
> itself to see if the requested module is already registered, and then do
> nothing.
> 
> Adding Luis (modules maintainer) to CC.
> 
> Luis, phylib calls request_module() unconditionally, regardless if there
> is a driver registered already (phy driver built as built-in).
> 
> This causes the splat described here to be printed:
> https://lore.kernel.org/netdev/7103704.9J7NaK4W3v@fedora.fritz.box/T/#u
> 
> But as far as I can tell, this is a false positive, since the call cannot
> possibly load any module (since the driver is built as built-in and not
> as a module).

Please be careful here. Just because it is built in for your build
does not mean it is built in for everybody. The code needs to be able
to load the module if it is not built in.

Also, i've seen broken builds where the driver is both built in and
exists as a module. User space will try to load the module, and the
linker will then complain about duplicate symbols and the load fails.
So it is not a false positive. There is also the question of what
exactly causes the deadlock. Is simply calling user space to load a
module which does not exist sufficient to cause the deadlock? That is
what is happening in your system, with built in modules.

Also, the kernel probably has no idea if the module has already been
loaded nor not when request_module() is called. What we pass to the
request_module() is not the name of the module, but a string which
represents one of the IDs the PHY has. It is then userspace that maps
that string to a kernel module via modules.alias. And there is a many
to one mapping, many IDs map to one module. So it could be the module
has already been loaded due to some other string.

So, back to my original question. How should module loading work with
async probe?

      Andrew
Francesco Valla Jan. 3, 2025, 5:57 p.m. UTC | #16
Hi,

On Friday, 3 January 2025 at 16:37:33 Andrew Lunn <andrew@lunn.ch> wrote:
> On Fri, Jan 03, 2025 at 03:34:26PM +0100, Niklas Cassel wrote:
> > On Fri, Jan 03, 2025 at 03:12:14PM +0100, Andrew Lunn wrote:
> > > > FWIW, the patch in $subject does make the splat go away for me.
> > > > (I have the PHY driver built as built-in).
> > > > 
> > > > The patch in $subject does "Add a list of registered drivers and check
> > > > if one is already available before resorting to call request_module();
> > > > in this way, if the PHY driver is already there, the MDIO bus can perform
> > > > the async probe."
> > > 
> > > Lets take a step backwards.
> > > 
> > > How in general should module loading work with async probe? Lets
> > > understand that first.
> > > 
> > > Then we can think about what is special about PHYs, and how we can fit
> > > into the existing framework.
> > 
> > I agree that it might be a good idea, if possible, for request_module()
> > itself to see if the requested module is already registered, and then do
> > nothing.
> > 
> > Adding Luis (modules maintainer) to CC.
> > 
> > Luis, phylib calls request_module() unconditionally, regardless if there
> > is a driver registered already (phy driver built as built-in).
> > 
> > This causes the splat described here to be printed:
> > https://lore.kernel.org/netdev/7103704.9J7NaK4W3v@fedora.fritz.box/T/#u
> > 
> > But as far as I can tell, this is a false positive, since the call cannot
> > possibly load any module (since the driver is built as built-in and not
> > as a module).
> 
> Please be careful here. Just because it is built in for your build
> does not mean it is built in for everybody. The code needs to be able
> to load the module if it is not built in.
> 

I fully agree here - moreover, the same WARN would be triggered if async probe
is used for a NIC driver which is built as a module, with PHY drivers
themselves being modules.

> Also, i've seen broken builds where the driver is both built in and
> exists as a module. User space will try to load the module, and the
> linker will then complain about duplicate symbols and the load fails.
> So it is not a false positive. There is also the question of what
> exactly causes the deadlock. Is simply calling user space to load a
> module which does not exist sufficient to cause the deadlock? That is
> what is happening in your system, with built in modules.
> 

The deadlock (which I did NOT observed) would be caused by a module init
function calling async_synchronize_full(); quoting from __request_module():

    /*
     * We don't allow synchronous module loading from async.  Module
     * init may invoke async_synchronize_full() which will end up
     * waiting for this task which already is waiting for the module
     * loading to complete, leading to a deadlock.
     */
    WARN_ON_ONCE(wait && current_is_async());

The deadlock was observed way back and the code above was added in 2013 [1];
some more work has been done during the years [2].

> Also, the kernel probably has no idea if the module has already been
> loaded nor not when request_module() is called. What we pass to the
> request_module() is not the name of the module, but a string which
> represents one of the IDs the PHY has. It is then userspace that maps
> that string to a kernel module via modules.alias. And there is a many
> to one mapping, many IDs map to one module. So it could be the module
> has already been loaded due to some other string.
> 
> So, back to my original question. How should module loading work with
> async probe?
>

Today it is not meant to work [1]; that was mainly the reason for my
original patch. Maybe the best solution would be to work on that.


[1] https://lore.kernel.org/all/20130118221227.GG24579@htj.dyndns.org/
[2] https://lore.kernel.org/lkml/YfsruBT19o3j0KD2@bombadil.infradead.org/T/#m4a8579a9e9a2508bea66223906e1917e164e6c70


Thank you!


Regards,
Francesco
Francesco Valla Jan. 3, 2025, 6 p.m. UTC | #17
On Thursday, 2 January 2025 at 15:21:51 Andrew Lunn <andrew@lunn.ch> wrote:
> > > One other question, how much speadup do you get with async probe of
> > > PHYs? Is it really worth the effort?
> > 
> > For me it's a reduction of ~170ms, which currently accounts for roughly the 25%
> > of the time spent before starting userspace (660ms vs 490ms, give or take a
> > couple of milliseconds). That's due to the large reset time required by the PHYs
> > to initialize
> 
> 170ms seems like a long time for a reset. Maybe check the datasheet
> and see what it says.
> 
> 	Andrew
> 

Turned out that another 50-60ms could be shaven off. I trusted the original
implementation for the platform on the reset timings, but the datasheet used
the same name for two time quantities and they chose the wrong one.

Regards,
Francesco
diff mbox series

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index b26bb33cd1d4..a9e8b834851c 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -44,6 +44,14 @@  MODULE_DESCRIPTION("PHY library");
 MODULE_AUTHOR("Andy Fleming");
 MODULE_LICENSE("GPL");
 
+struct phy_drv_node {
+	const struct phy_driver *drv;
+	struct list_head list;
+};
+
+static LIST_HEAD(phy_drv_list);
+static DECLARE_RWSEM(phy_drv_list_sem);
+
 __ETHTOOL_DECLARE_LINK_MODE_MASK(phy_basic_features) __ro_after_init;
 EXPORT_SYMBOL_GPL(phy_basic_features);
 
@@ -658,6 +666,23 @@  static int phy_request_driver_module(struct phy_device *dev, u32 phy_id)
 	return 0;
 }
 
+static bool phy_driver_exists(u32 phy_id)
+{
+	bool found = false;
+	struct phy_drv_node *node;
+
+	down_read(&phy_drv_list_sem);
+	list_for_each_entry(node, &phy_drv_list, list) {
+		if (phy_id_compare(phy_id, node->drv->phy_id, node->drv->phy_id_mask)) {
+			found = true;
+			break;
+		}
+	}
+	up_read(&phy_drv_list_sem);
+
+	return found;
+}
+
 struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32 phy_id,
 				     bool is_c45,
 				     struct phy_c45_device_ids *c45_ids)
@@ -709,11 +734,7 @@  struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32 phy_id,
 	mutex_init(&dev->lock);
 	INIT_DELAYED_WORK(&dev->state_queue, phy_state_machine);
 
-	/* Request the appropriate module unconditionally; don't
-	 * bother trying to do so only if it isn't already loaded,
-	 * because that gets complicated. A hotplug event would have
-	 * done an unconditional modprobe anyway.
-	 * We don't do normal hotplug because it won't work for MDIO
+	/* We don't do normal hotplug because it won't work for MDIO
 	 * -- because it relies on the device staying around for long
 	 * enough for the driver to get loaded. With MDIO, the NIC
 	 * driver will get bored and give up as soon as it finds that
@@ -724,7 +745,8 @@  struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32 phy_id,
 		int i;
 
 		for (i = 1; i < num_ids; i++) {
-			if (c45_ids->device_ids[i] == 0xffffffff)
+			if (c45_ids->device_ids[i] == 0xffffffff ||
+			    phy_driver_exists(c45_ids->device_ids[i]))
 				continue;
 
 			ret = phy_request_driver_module(dev,
@@ -732,7 +754,7 @@  struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32 phy_id,
 			if (ret)
 				break;
 		}
-	} else {
+	} else if (!phy_driver_exists(phy_id)) {
 		ret = phy_request_driver_module(dev, phy_id);
 	}
 
@@ -3674,6 +3696,7 @@  static int phy_remove(struct device *dev)
  */
 int phy_driver_register(struct phy_driver *new_driver, struct module *owner)
 {
+	struct phy_drv_node *node;
 	int retval;
 
 	/* Either the features are hard coded, or dynamically
@@ -3695,6 +3718,10 @@  int phy_driver_register(struct phy_driver *new_driver, struct module *owner)
 		 new_driver->name))
 		return -EINVAL;
 
+	node = kzalloc(sizeof(*node), GFP_KERNEL);
+	if (!node)
+		return -ENOMEM;
+
 	new_driver->mdiodrv.flags |= MDIO_DEVICE_IS_PHY;
 	new_driver->mdiodrv.driver.name = new_driver->name;
 	new_driver->mdiodrv.driver.bus = &mdio_bus_type;
@@ -3707,10 +3734,15 @@  int phy_driver_register(struct phy_driver *new_driver, struct module *owner)
 	if (retval) {
 		pr_err("%s: Error %d in registering driver\n",
 		       new_driver->name, retval);
-
+		kfree(node);
 		return retval;
 	}
 
+	down_write(&phy_drv_list_sem);
+	node->drv = new_driver;
+	list_add(&node->list, &phy_drv_list);
+	up_write(&phy_drv_list_sem);
+
 	pr_debug("%s: Registered new driver\n", new_driver->name);
 
 	return 0;
@@ -3736,6 +3768,19 @@  EXPORT_SYMBOL(phy_drivers_register);
 
 void phy_driver_unregister(struct phy_driver *drv)
 {
+	struct phy_drv_node *node;
+
+	down_write(&phy_drv_list_sem);
+	list_for_each_entry(node, &phy_drv_list, list) {
+		if (phy_id_compare(drv->phy_id,
+				   node->drv->phy_id, node->drv->phy_id_mask)) {
+			list_del(&node->list);
+			kfree(node);
+			break;
+		}
+	}
+	up_write(&phy_drv_list_sem);
+
 	driver_unregister(&drv->mdiodrv.driver);
 }
 EXPORT_SYMBOL(phy_driver_unregister);