Message ID | 20240507102822.2023826-1-maxime.chevallier@bootlin.com (mailing list archive) |
---|---|
Headers | show |
Series | Fix phy_link_topology initialization | expand |
Hi Maxime, On Tue, May 07, 2024 at 12:28:19PM +0200, Maxime Chevallier wrote: > Nathan and Heiner reported issues that occur when phylib and phy drivers > built as modules expect the phy_link_topology to be initialized, due to > wrong use of IS_REACHABLE. > > This small fixup series addresses that by moving the initialization code > into net/core/dev.c, but at the same time implementing lazy > initialization to only allocate the topology upon the first PHY > insertion. > > This needed some refactoring, namely pass the netdevice itself as a > parameter for phy_link_topology helpers. > > Thanks Heiner for the help on untangling this, and Nathan for the > report. Are you able to prioritize getting this series merged? This has been a problem in -next for over a month now and the merge window is now open. I would hate to see this regress in mainline, as my main system may be affected by it (not sure, I got a new test machine that got bit by it in addition to the other two I noticed it on). Cheers, Nathan
On Sun, May 12, 2024 at 11:36:36PM -0700, Nathan Chancellor wrote: > Hi Maxime, > > On Tue, May 07, 2024 at 12:28:19PM +0200, Maxime Chevallier wrote: > > Nathan and Heiner reported issues that occur when phylib and phy drivers > > built as modules expect the phy_link_topology to be initialized, due to > > wrong use of IS_REACHABLE. > > > > This small fixup series addresses that by moving the initialization code > > into net/core/dev.c, but at the same time implementing lazy > > initialization to only allocate the topology upon the first PHY > > insertion. > > > > This needed some refactoring, namely pass the netdevice itself as a > > parameter for phy_link_topology helpers. > > > > Thanks Heiner for the help on untangling this, and Nathan for the > > report. > > Are you able to prioritize getting this series merged? This has been a > problem in -next for over a month now and the merge window is now open. > I would hate to see this regress in mainline, as my main system may be > affected by it (not sure, I got a new test machine that got bit by it in > addition to the other two I noticed it on). ... and Maxime has been working on trying to get an acceptable fix for it over that time, with to-and-fro discussions. Maxime still hasn't got an ack from Heiner for the fixes, and changes are still being requested. I think, sadly, the only way forward at this point would be to revert the original commit. I've just tried reverting 6916e461e793 in my net-next tree and it's possible, although a little noisy: $ git revert 6916e461e793 Performing inexact rename detection: 100% (8904/8904), done. Auto-merging net/core/dev.c Auto-merging include/uapi/linux/ethtool.h Removing include/linux/phy_link_topology_core.h Removing include/linux/phy_link_topology.h Auto-merging include/linux/phy.h Auto-merging include/linux/netdevice.h Removing drivers/net/phy/phy_link_topology.c Auto-merging drivers/net/phy/phy_device.c Auto-merging MAINTAINERS hint: Waiting for your editor to close the file... I haven't checked whether that ends up with something that's buildable. Any views Jakub/Dave/Paolo?
On Mon, 13 May 2024 10:15:48 +0100 Russell King (Oracle) wrote: > ... and Maxime has been working on trying to get an acceptable fix for > it over that time, with to-and-fro discussions. Maxime still hasn't got > an ack from Heiner for the fixes, and changes are still being > requested. > > I think, sadly, the only way forward at this point would be to revert > the original commit. I've just tried reverting 6916e461e793 in my > net-next tree and it's possible, although a little noisy: > > $ git revert 6916e461e793 > Performing inexact rename detection: 100% (8904/8904), done. > Auto-merging net/core/dev.c > Auto-merging include/uapi/linux/ethtool.h > Removing include/linux/phy_link_topology_core.h > Removing include/linux/phy_link_topology.h > Auto-merging include/linux/phy.h > Auto-merging include/linux/netdevice.h > Removing drivers/net/phy/phy_link_topology.c > Auto-merging drivers/net/phy/phy_device.c > Auto-merging MAINTAINERS > hint: Waiting for your editor to close the file... > > I haven't checked whether that ends up with something that's buildable. > > Any views Jakub/Dave/Paolo? I think you're right. The series got half-merged, we shouldn't push it into a release in this state. We should revert all of it, I reckon? 6916e461e793 ("net: phy: Introduce ethernet link topology representation") 0ec5ed6c130e ("net: sfp: pass the phy_device when disconnecting an sfp module's PHY") e75e4e074c44 ("net: phy: add helpers to handle sfp phy connect/disconnect") fdd353965b52 ("net: sfp: Add helper to return the SFP bus name") 841942bc6212 ("net: ethtool: Allow passing a phy index for some commands") Does anyone feel strongly that we should try to patch it up instead?
Hi Jakub, On Mon, 13 May 2024 08:11:38 -0700 Jakub Kicinski <kuba@kernel.org> wrote: > On Mon, 13 May 2024 10:15:48 +0100 Russell King (Oracle) wrote: > > ... and Maxime has been working on trying to get an acceptable fix for > > it over that time, with to-and-fro discussions. Maxime still hasn't got > > an ack from Heiner for the fixes, and changes are still being > > requested. > > > > I think, sadly, the only way forward at this point would be to revert > > the original commit. I've just tried reverting 6916e461e793 in my > > net-next tree and it's possible, although a little noisy: > > > > $ git revert 6916e461e793 > > Performing inexact rename detection: 100% (8904/8904), done. > > Auto-merging net/core/dev.c > > Auto-merging include/uapi/linux/ethtool.h > > Removing include/linux/phy_link_topology_core.h > > Removing include/linux/phy_link_topology.h > > Auto-merging include/linux/phy.h > > Auto-merging include/linux/netdevice.h > > Removing drivers/net/phy/phy_link_topology.c > > Auto-merging drivers/net/phy/phy_device.c > > Auto-merging MAINTAINERS > > hint: Waiting for your editor to close the file... > > > > I haven't checked whether that ends up with something that's buildable. > > > > Any views Jakub/Dave/Paolo? > > I think you're right. The series got half-merged, we shouldn't push it > into a release in this state. We should revert all of it, I reckon? > > 6916e461e793 ("net: phy: Introduce ethernet link topology representation") > 0ec5ed6c130e ("net: sfp: pass the phy_device when disconnecting an sfp module's PHY") > e75e4e074c44 ("net: phy: add helpers to handle sfp phy connect/disconnect") > fdd353965b52 ("net: sfp: Add helper to return the SFP bus name") > 841942bc6212 ("net: ethtool: Allow passing a phy index for some commands") > > Does anyone feel strongly that we should try to patch it up instead? It's OK for me, at least this showed some of the shortcomings with the current code, let's come back with a better version for the next round. Thanks, Maxime