diff mbox series

[net] phy: fix null pointer issue in phy_attach_direct()

Message ID 20250129183638.695010-1-sreedevi.joshi@intel.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] phy: fix null pointer issue in phy_attach_direct() | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 3 blamed authors not CCed: f.fainelli@gmail.com ioana.ciornei@nxp.com olteanv@gmail.com; 3 maintainers not CCed: f.fainelli@gmail.com ioana.ciornei@nxp.com olteanv@gmail.com
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 39 this patch: 39
netdev/source_inline success Was 0 now: 0

Commit Message

sreedevi.joshi Jan. 29, 2025, 6:36 p.m. UTC
From: Sreedevi Joshi <sreedevi.joshi@intel.com>

When attaching a fixed phy to devices like veth, it is
possible that there is no parent. The logic in
phy_attach_direct() tries to access the driver member
without checking for the null. This causes segfault in the
case of fixed phy.

Checks are in place now to ensure the parent is not null
before accessing to address this scenario.

Fixes: 2db2d9d1ac37 ("net: phy: Guard against the presence of a netdev")
Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
---
 drivers/net/phy/phy_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Andrew Lunn Jan. 29, 2025, 7:14 p.m. UTC | #1
On Wed, Jan 29, 2025 at 12:36:38PM -0600, sreedevi.joshi wrote:
> From: Sreedevi Joshi <sreedevi.joshi@intel.com>
> 
> When attaching a fixed phy to devices like veth

Humm. Zoom out. What is the big picture? Why would a veth need a PHY?

	Andrew
Russell King (Oracle) Jan. 29, 2025, 7:29 p.m. UTC | #2
On Wed, Jan 29, 2025 at 12:36:38PM -0600, sreedevi.joshi wrote:
> From: Sreedevi Joshi <sreedevi.joshi@intel.com>
> 
> When attaching a fixed phy to devices like veth, it is
> possible that there is no parent. The logic in
> phy_attach_direct() tries to access the driver member
> without checking for the null. This causes segfault in the
> case of fixed phy.

Kernel mode doesn't segfault. That's a userspace thing. Kernel mode
oopses.

I'm confused. You mention veth, which presumably is drivers/net/veth.c.
Grepping this driver for "phy" returns nothing. So how can veth be
broken by a phylib change?
Joshi, Sreedevi Jan. 30, 2025, 7:10 p.m. UTC | #3
> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Wednesday, January 29, 2025 2:14 PM
> To: sreedevi.joshi <joshisre@ecsmtp.an.intel.com>
> Cc: hkallweit1@gmail.com; linux@armlinux.org.uk; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com;
> netdev@vger.kernel.org; Joshi, Sreedevi <sreedevi.joshi@intel.com>
> Subject: Re: [PATCH net] phy: fix null pointer issue in phy_attach_direct()
> 
> On Wed, Jan 29, 2025 at 12:36:38PM -0600, sreedevi.joshi wrote:
> > From: Sreedevi Joshi <sreedevi.joshi@intel.com>
> >
> > When attaching a fixed phy to devices like veth
> 
> Humm. Zoom out. What is the big picture? Why would a veth need a PHY?
> 
> 	Andrew
[] 
This issue was encountered when working on a POC to demo the mii_timestamper timestamp
callback hooks mechanism. We are using veth pairs as we don't have the HW yet. In this demo,
we connect a fixed PHY to veth and attach mii_timestamper hooks that way. However, as veth device
(like any other virtual interfaces) does not have a parent, it causes Kernel Oops and on our system
it needs a reboot to recover the system. With this check in place,
we could connect fixed PHY and mii_timestamper hooks successfully. I understand
it is not a common practice to attach a PHY to a virtual interface. However, having a check for NULL
before accessing the member will be good to avoid issues.

Wanted to check with community if it is worth applying this to the upstream driver.

This was the crash log:
025-01-30T18:26:53.952545+00:00 dregen kernel: BUG: kernel NULL pointer dereference, address: 0000000000000068
2025-01-30T18:26:53.952570+00:00 dregen kernel: #PF: supervisor read access in kernel mode
2025-01-30T18:26:53.952571+00:00 dregen kernel: #PF: error_code(0x0000) - not-present page
2025-01-30T18:26:53.952572+00:00 dregen kernel: PGD 0 P4D 0
2025-01-30T18:26:53.952573+00:00 dregen kernel: Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
2025-01-30T18:26:53.952573+00:00 dregen kernel: CPU: 80 UID: 0 PID: 7734 Comm: ip Not tainted 6.11.0+ #100
2025-01-30T18:26:53.952574+00:00 dregen kernel: Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0011.032620200659 03/26/2020
2025-01-30T18:26:53.952576+00:00 dregen kernel: RIP: 0010:phy_attach_direct+0x34/0x3f0
2025-01-30T18:26:53.952577+00:00 dregen kernel: Code: 55 45 31 ed 41 54 49 89 f4 55 89 d5 53 48 89 fb 48 83 ec 08 4c 8b b6 e0 02 00 00 89 0c 24 48 85 ff 74 0f 48 8b 87 78 05 00 00 <48> 8b 40 68 4c 8b 68 10 49 8b 3e 4c 39 ef 74 0d e8 27 a6 85 ff 84
2025-01-30T18:26:53.952577+00:00 dregen kernel: RSP: 0018:ffffb6b4337c7698 EFLAGS: 00010286
2025-01-30T18:26:53.952578+00:00 dregen kernel: RAX: 0000000000000000 RBX: ffff9dbd9a3c5000 RCX: 0000000000000002
2025-01-30T18:26:53.952578+00:00 dregen kernel: RDX: 0000000000000000 RSI: ffff9dbda39f9800 RDI: ffff9dbd9a3c5000
2025-01-30T18:26:53.952579+00:00 dregen kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
2025-01-30T18:26:53.952579+00:00 dregen kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9dbda39f9800
2025-01-30T18:26:53.952581+00:00 dregen kernel: R13: 0000000000000000 R14: ffff9d014dc48000 R15: ffff9dbda39f9000
2025-01-30T18:26:53.952582+00:00 dregen kernel: FS:  00007f6e30b20b80(0000) GS:ffff9e778bb00000(0000) knlGS:0000000000000000
2025-01-30T18:26:53.952582+00:00 dregen kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2025-01-30T18:26:53.952582+00:00 dregen kernel: CR2: 0000000000000068 CR3: 000000bd51444002 CR4: 00000000007706f0
2025-01-30T18:26:53.952583+00:00 dregen kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2025-01-30T18:26:53.952583+00:00 dregen kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
2025-01-30T18:26:53.952585+00:00 dregen kernel: PKRU: 55555554
2025-01-30T18:26:53.952585+00:00 dregen kernel: Call Trace:
2025-01-30T18:26:53.952586+00:00 dregen kernel:  <TASK>
2025-01-30T18:26:53.952586+00:00 dregen kernel:  ? __die+0x1f/0x60
2025-01-30T18:26:53.952586+00:00 dregen kernel:  ? page_fault_oops+0x15c/0x450
2025-01-30T18:26:53.952587+00:00 dregen kernel:  ? klist_next+0x145/0x150
2025-01-30T18:26:53.952589+00:00 dregen kernel:  ? exc_page_fault+0x77/0x160
2025-01-30T18:26:53.952589+00:00 dregen kernel:  ? asm_exc_page_fault+0x22/0x30
2025-01-30T18:26:53.952590+00:00 dregen kernel:  ? phy_attach_direct+0x34/0x3f0
2025-01-30T18:26:53.952590+00:00 dregen kernel:  ? __pfx_veth_adjust_link+0x10/0x10 [veth]
2025-01-30T18:26:53.952591+00:00 dregen kernel:  phy_connect_direct+0x21/0x70
2025-01-30T18:26:53.952591+00:00 dregen kernel:  veth_newlink+0x1f7/0x550 [veth]
2025-01-30T18:26:53.952591+00:00 dregen kernel:  __rtnl_newlink+0x70f/0x980
2025-01-30T18:26:53.952593+00:00 dregen kernel:  ? avc_has_perm_noaudit+0x67/0xf0
2025-01-30T18:26:53.952593+00:00 dregen kernel:  rtnl_newlink+0x43/0x70
2025-01-30T18:26:53.952594+00:00 dregen kernel:  rtnetlink_rcv_msg+0x14b/0x3f0
2025-01-30T18:26:53.952594+00:00 dregen kernel:  ? dl_server_stop+0x2b/0x40
2025-01-30T18:26:53.952594+00:00 dregen kernel:  ? __perf_event_task_sched_in+0x8c/0x200
2025-01-30T18:26:53.952595+00:00 dregen kernel:  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
2025-01-30T18:26:53.952597+00:00 dregen kernel:  netlink_rcv_skb+0x54/0x100
2025-01-30T18:26:53.952597+00:00 dregen kernel:  netlink_unicast+0x23e/0x390
2025-01-30T18:26:53.952598+00:00 dregen kernel:  netlink_sendmsg+0x1f3/0x440
2025-01-30T18:26:53.952598+00:00 dregen kernel:  ____sys_sendmsg+0x2d7/0x310
2025-01-30T18:26:53.952598+00:00 dregen kernel:  ? copy_msghdr_from_user+0x6d/0xa0
2025-01-30T18:26:53.952599+00:00 dregen kernel:  ___sys_sendmsg+0x86/0xd0
2025-01-30T18:26:53.952599+00:00 dregen kernel:  ? do_fault+0x2a4/0x5d0
2025-01-30T18:26:53.952601+00:00 dregen kernel:  ? __handle_mm_fault+0x55f/0xff0
2025-01-30T18:26:53.952601+00:00 dregen kernel:  __sys_sendmsg+0x57/0xa0
2025-01-30T18:26:53.952602+00:00 dregen kernel:  do_syscall_64+0x3b/0xc0
2025-01-30T18:26:53.952602+00:00 dregen kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Joshi, Sreedevi Jan. 30, 2025, 7:18 p.m. UTC | #4
> -----Original Message-----
> From: Russell King <linux@armlinux.org.uk>
> Sent: Wednesday, January 29, 2025 2:30 PM
> To: sreedevi.joshi <joshisre@ecsmtp.an.intel.com>
> Cc: andrew@lunn.ch; hkallweit1@gmail.com; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com;
> netdev@vger.kernel.org; Joshi, Sreedevi <sreedevi.joshi@intel.com>
> Subject: Re: [PATCH net] phy: fix null pointer issue in phy_attach_direct()
> 
> On Wed, Jan 29, 2025 at 12:36:38PM -0600, sreedevi.joshi wrote:
> > From: Sreedevi Joshi <sreedevi.joshi@intel.com>
> >
> > When attaching a fixed phy to devices like veth, it is possible that
> > there is no parent. The logic in
> > phy_attach_direct() tries to access the driver member without checking
> > for the null. This causes segfault in the case of fixed phy.
> 
> Kernel mode doesn't segfault. That's a userspace thing. Kernel mode oopses.
You are right. My bad for using for the wrong terminology. I have attached the
Kernel Oops log in my previous reply.
> 
> I'm confused. You mention veth, which presumably is drivers/net/veth.c.
> Grepping this driver for "phy" returns nothing. So how can veth be broken by a phylib change?
Yes. There is no PHY attached to the veth. This is part of the POC we are working on. We attach a
Fixed PHY to veth so the mii_timestamper hooks can be exercised for the demo. This was uncovered 
during that and even though there is no PHY usually attached to a virtual interface, it may be good
idea to have the necessary check before accessing the member to avoid issues. Wanted to check with
community if it can be included in the upstream driver.
> 
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Andrew Lunn Jan. 30, 2025, 7:52 p.m. UTC | #5
On Thu, Jan 30, 2025 at 07:10:49PM +0000, Joshi, Sreedevi wrote:
> 
> > -----Original Message-----
> > From: Andrew Lunn <andrew@lunn.ch>
> > Sent: Wednesday, January 29, 2025 2:14 PM
> > To: sreedevi.joshi <joshisre@ecsmtp.an.intel.com>
> > Cc: hkallweit1@gmail.com; linux@armlinux.org.uk; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com;
> > netdev@vger.kernel.org; Joshi, Sreedevi <sreedevi.joshi@intel.com>
> > Subject: Re: [PATCH net] phy: fix null pointer issue in phy_attach_direct()
> > 
> > On Wed, Jan 29, 2025 at 12:36:38PM -0600, sreedevi.joshi wrote:
> > > From: Sreedevi Joshi <sreedevi.joshi@intel.com>
> > >
> > > When attaching a fixed phy to devices like veth
> > 
> > Humm. Zoom out. What is the big picture? Why would a veth need a PHY?
> > 
> > 	Andrew
> [] 
> This issue was encountered when working on a POC to demo the mii_timestamper timestamp
> callback hooks mechanism. We are using veth pairs as we don't have the HW yet. In this demo,
> we connect a fixed PHY to veth and attach mii_timestamper hooks that way. However, as veth device
> (like any other virtual interfaces) does not have a parent, it causes Kernel Oops and on our system
> it needs a reboot to recover the system. With this check in place,
> we could connect fixed PHY and mii_timestamper hooks successfully. I understand
> it is not a common practice to attach a PHY to a virtual interface. However, having a check for NULL
> before accessing the member will be good to avoid issues.

Well, there is a flip side to this. You are doing something which does
not make sense. Getting an Opps is a good indication you are doing
something you should not. And the Opps makes it easy to
debug. Silently ignoring the problem makes it a lot harder to find.

Is there a legitimate use case for a physical network device without a
parent device? It looks like phy_attach_direct() has been referencing
the parent since December 2016, so given its history i'm not sure
there is a legitimate use case.

I assume when you get real hardware you will have both a parent and a
PHY?

	Andrew
diff mbox series

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 46713d27412b..be813962cb89 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1471,7 +1471,7 @@  int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	 * our own module->refcnt here, otherwise we would not be able to
 	 * unload later on.
 	 */
-	if (dev)
+	if (dev && dev->dev.parent)
 		ndev_owner = dev->dev.parent->driver->owner;
 	if (ndev_owner != bus->owner && !try_module_get(bus->owner)) {
 		phydev_err(phydev, "failed to get the bus module\n");