diff mbox series

[net-next,v17,01/14] net: phy: Introduce ethernet link topology representation

Message ID 20240709063039.2909536-2-maxime.chevallier@bootlin.com (mailing list archive)
State Deferred
Delegated to: Netdev Maintainers
Headers show
Series Introduce PHY listing and link_topology tracking | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; GEN HAS DIFF 2 files changed, 911 insertions(+);
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 863 this patch: 863
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 1 maintainers not CCed: sd@queasysnail.net
netdev/build_clang success Errors and warnings before: 917 this patch: 917
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 5000 this patch: 5000
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 329 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 171 this patch: 171
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-07-11--18-00 (tests: 696)

Commit Message

Maxime Chevallier July 9, 2024, 6:30 a.m. UTC
Link topologies containing multiple network PHYs attached to the same
net_device can be found when using a PHY as a media converter for use
with an SFP connector, on which an SFP transceiver containing a PHY can
be used.

With the current model, the transceiver's PHY can't be used for
operations such as cable testing, timestamping, macsec offload, etc.

The reason being that most of the logic for these configuration, coming
from either ethtool netlink or ioctls tend to use netdev->phydev, which
in multi-phy systems will reference the PHY closest to the MAC.

Introduce a numbering scheme allowing to enumerate PHY devices that
belong to any netdev, which can in turn allow userspace to take more
precise decisions with regard to each PHY's configuration.

The numbering is maintained per-netdev, in a phy_device_list.
The numbering works similarly to a netdevice's ifindex, with
identifiers that are only recycled once INT_MAX has been reached.

This prevents races that could occur between PHY listing and SFP
transceiver removal/insertion.

The identifiers are assigned at phy_attach time, as the numbering
depends on the netdevice the phy is attached to. The PHY index can be
re-used for PHYs that are persistent.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
---
 MAINTAINERS                         |   1 +
 drivers/net/phy/Makefile            |   2 +-
 drivers/net/phy/phy_device.c        |   6 ++
 drivers/net/phy/phy_link_topology.c | 105 ++++++++++++++++++++++++++++
 include/linux/netdevice.h           |   4 +-
 include/linux/phy.h                 |   4 ++
 include/linux/phy_link_topology.h   |  82 ++++++++++++++++++++++
 include/uapi/linux/ethtool.h        |  16 +++++
 net/core/dev.c                      |  15 ++++
 9 files changed, 233 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/phy/phy_link_topology.c
 create mode 100644 include/linux/phy_link_topology.h

Comments

Christophe Leroy Aug. 14, 2024, 2:28 p.m. UTC | #1
Le 09/07/2024 à 08:30, Maxime Chevallier a écrit :
> Link topologies containing multiple network PHYs attached to the same
> net_device can be found when using a PHY as a media converter for use
> with an SFP connector, on which an SFP transceiver containing a PHY can
> be used.
> 
> With the current model, the transceiver's PHY can't be used for
> operations such as cable testing, timestamping, macsec offload, etc.
> 
> The reason being that most of the logic for these configuration, coming
> from either ethtool netlink or ioctls tend to use netdev->phydev, which
> in multi-phy systems will reference the PHY closest to the MAC.
> 
> Introduce a numbering scheme allowing to enumerate PHY devices that
> belong to any netdev, which can in turn allow userspace to take more
> precise decisions with regard to each PHY's configuration.
> 
> The numbering is maintained per-netdev, in a phy_device_list.
> The numbering works similarly to a netdevice's ifindex, with
> identifiers that are only recycled once INT_MAX has been reached.
> 
> This prevents races that could occur between PHY listing and SFP
> transceiver removal/insertion.
> 
> The identifiers are assigned at phy_attach time, as the numbering
> depends on the netdevice the phy is attached to. The PHY index can be
> re-used for PHYs that are persistent.
> 
> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>

> ---
>   MAINTAINERS                         |   1 +
>   drivers/net/phy/Makefile            |   2 +-
>   drivers/net/phy/phy_device.c        |   6 ++
>   drivers/net/phy/phy_link_topology.c | 105 ++++++++++++++++++++++++++++
>   include/linux/netdevice.h           |   4 +-
>   include/linux/phy.h                 |   4 ++
>   include/linux/phy_link_topology.h   |  82 ++++++++++++++++++++++
>   include/uapi/linux/ethtool.h        |  16 +++++
>   net/core/dev.c                      |  15 ++++
>   9 files changed, 233 insertions(+), 2 deletions(-)
>   create mode 100644 drivers/net/phy/phy_link_topology.c
>   create mode 100644 include/linux/phy_link_topology.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e0f28278e504..5135c3379234 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8194,6 +8194,7 @@ F:	include/linux/mii.h
>   F:	include/linux/of_net.h
>   F:	include/linux/phy.h
>   F:	include/linux/phy_fixed.h
> +F:	include/linux/phy_link_topology.h
>   F:	include/linux/phylib_stubs.h
>   F:	include/linux/platform_data/mdio-bcm-unimac.h
>   F:	include/linux/platform_data/mdio-gpio.h
> diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
> index 202ed7f450da..1d8be374915f 100644
> --- a/drivers/net/phy/Makefile
> +++ b/drivers/net/phy/Makefile
> @@ -2,7 +2,7 @@
>   # Makefile for Linux PHY drivers
>   
>   libphy-y			:= phy.o phy-c45.o phy-core.o phy_device.o \
> -				   linkmode.o
> +				   linkmode.o phy_link_topology.o
>   mdio-bus-y			+= mdio_bus.o mdio_device.o
>   
>   ifdef CONFIG_MDIO_DEVICE
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 70b07e621fb2..e68acaba1b4f 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -29,6 +29,7 @@
>   #include <linux/phy.h>
>   #include <linux/phylib_stubs.h>
>   #include <linux/phy_led_triggers.h>
> +#include <linux/phy_link_topology.h>
>   #include <linux/pse-pd/pse.h>
>   #include <linux/property.h>
>   #include <linux/rtnetlink.h>
> @@ -1511,6 +1512,10 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>   
>   		if (phydev->sfp_bus_attached)
>   			dev->sfp_bus = phydev->sfp_bus;
> +
> +		err = phy_link_topo_add_phy(dev, phydev, PHY_UPSTREAM_MAC, dev);
> +		if (err)
> +			goto error;
>   	}
>   
>   	/* Some Ethernet drivers try to connect to a PHY device before
> @@ -1938,6 +1943,7 @@ void phy_detach(struct phy_device *phydev)
>   	if (dev) {
>   		phydev->attached_dev->phydev = NULL;
>   		phydev->attached_dev = NULL;
> +		phy_link_topo_del_phy(dev, phydev);
>   	}
>   	phydev->phylink = NULL;
>   
> diff --git a/drivers/net/phy/phy_link_topology.c b/drivers/net/phy/phy_link_topology.c
> new file mode 100644
> index 000000000000..4a5d73002a1a
> --- /dev/null
> +++ b/drivers/net/phy/phy_link_topology.c
> @@ -0,0 +1,105 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Infrastructure to handle all PHY devices connected to a given netdev,
> + * either directly or indirectly attached.
> + *
> + * Copyright (c) 2023 Maxime Chevallier<maxime.chevallier@bootlin.com>
> + */
> +
> +#include <linux/phy_link_topology.h>
> +#include <linux/phy.h>
> +#include <linux/rtnetlink.h>
> +#include <linux/xarray.h>
> +
> +static int netdev_alloc_phy_link_topology(struct net_device *dev)
> +{
> +	struct phy_link_topology *topo;
> +
> +	topo = kzalloc(sizeof(*topo), GFP_KERNEL);
> +	if (!topo)
> +		return -ENOMEM;
> +
> +	xa_init_flags(&topo->phys, XA_FLAGS_ALLOC1);
> +	topo->next_phy_index = 1;
> +
> +	dev->link_topo = topo;
> +
> +	return 0;
> +}
> +
> +int phy_link_topo_add_phy(struct net_device *dev,
> +			  struct phy_device *phy,
> +			  enum phy_upstream upt, void *upstream)
> +{
> +	struct phy_link_topology *topo = dev->link_topo;
> +	struct phy_device_node *pdn;
> +	int ret;
> +
> +	if (!topo) {
> +		ret = netdev_alloc_phy_link_topology(dev);
> +		if (ret)
> +			return ret;
> +
> +		topo = dev->link_topo;
> +	}
> +
> +	pdn = kzalloc(sizeof(*pdn), GFP_KERNEL);
> +	if (!pdn)
> +		return -ENOMEM;
> +
> +	pdn->phy = phy;
> +	switch (upt) {
> +	case PHY_UPSTREAM_MAC:
> +		pdn->upstream.netdev = (struct net_device *)upstream;
> +		if (phy_on_sfp(phy))
> +			pdn->parent_sfp_bus = pdn->upstream.netdev->sfp_bus;
> +		break;
> +	case PHY_UPSTREAM_PHY:
> +		pdn->upstream.phydev = (struct phy_device *)upstream;
> +		if (phy_on_sfp(phy))
> +			pdn->parent_sfp_bus = pdn->upstream.phydev->sfp_bus;
> +		break;
> +	default:
> +		ret = -EINVAL;
> +		goto err;
> +	}
> +	pdn->upstream_type = upt;
> +
> +	/* Attempt to re-use a previously allocated phy_index */
> +	if (phy->phyindex)
> +		ret = xa_insert(&topo->phys, phy->phyindex, pdn, GFP_KERNEL);
> +	else
> +		ret = xa_alloc_cyclic(&topo->phys, &phy->phyindex, pdn,
> +				      xa_limit_32b, &topo->next_phy_index,
> +				      GFP_KERNEL);
> +
> +	if (ret)
> +		goto err;
> +
> +	return 0;
> +
> +err:
> +	kfree(pdn);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(phy_link_topo_add_phy);
> +
> +void phy_link_topo_del_phy(struct net_device *dev,
> +			   struct phy_device *phy)
> +{
> +	struct phy_link_topology *topo = dev->link_topo;
> +	struct phy_device_node *pdn;
> +
> +	if (!topo)
> +		return;
> +
> +	pdn = xa_erase(&topo->phys, phy->phyindex);
> +
> +	/* We delete the PHY from the topology, however we don't re-set the
> +	 * phy->phyindex field. If the PHY isn't gone, we can re-assign it the
> +	 * same index next time it's added back to the topology
> +	 */
> +
> +	kfree(pdn);
> +}
> +EXPORT_SYMBOL_GPL(phy_link_topo_del_phy);
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 93558645c6d0..937da1dfcb2c 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -40,7 +40,6 @@
>   #include <net/dcbnl.h>
>   #endif
>   #include <net/netprio_cgroup.h>
> -
>   #include <linux/netdev_features.h>
>   #include <linux/neighbour.h>
>   #include <linux/netdevice_xmit.h>
> @@ -81,6 +80,7 @@ struct xdp_frame;
>   struct xdp_metadata_ops;
>   struct xdp_md;
>   struct ethtool_netdev_state;
> +struct phy_link_topology;
>   
>   typedef u32 xdp_features_t;
>   
> @@ -1977,6 +1977,7 @@ enum netdev_reg_state {
>    *	@fcoe_ddp_xid:	Max exchange id for FCoE LRO by ddp
>    *
>    *	@priomap:	XXX: need comments on this one
> + *	@link_topo:	Physical link topology tracking attached PHYs
>    *	@phydev:	Physical device may attach itself
>    *			for hardware timestamping
>    *	@sfp_bus:	attached &struct sfp_bus structure.
> @@ -2368,6 +2369,7 @@ struct net_device {
>   #if IS_ENABLED(CONFIG_CGROUP_NET_PRIO)
>   	struct netprio_map __rcu *priomap;
>   #endif
> +	struct phy_link_topology	*link_topo;
>   	struct phy_device	*phydev;
>   	struct sfp_bus		*sfp_bus;
>   	struct lock_class_key	*qdisc_tx_busylock;
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index bd68f9d8e74f..2d477eb2809a 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -554,6 +554,9 @@ struct macsec_ops;
>    * @drv: Pointer to the driver for this PHY instance
>    * @devlink: Create a link between phy dev and mac dev, if the external phy
>    *           used by current mac interface is managed by another mac interface.
> + * @phyindex: Unique id across the phy's parent tree of phys to address the PHY
> + *	      from userspace, similar to ifindex. A zero index means the PHY
> + *	      wasn't assigned an id yet.
>    * @phy_id: UID for this device found during discovery
>    * @c45_ids: 802.3-c45 Device Identifiers if is_c45.
>    * @is_c45:  Set to true if this PHY uses clause 45 addressing.
> @@ -654,6 +657,7 @@ struct phy_device {
>   
>   	struct device_link *devlink;
>   
> +	u32 phyindex;
>   	u32 phy_id;
>   
>   	struct phy_c45_device_ids c45_ids;
> diff --git a/include/linux/phy_link_topology.h b/include/linux/phy_link_topology.h
> new file mode 100644
> index 000000000000..68a59e25821c
> --- /dev/null
> +++ b/include/linux/phy_link_topology.h
> @@ -0,0 +1,82 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * PHY device list allow maintaining a list of PHY devices that are
> + * part of a netdevice's link topology. PHYs can for example be chained,
> + * as is the case when using a PHY that exposes an SFP module, on which an
> + * SFP transceiver that embeds a PHY is connected.
> + *
> + * This list can then be used by userspace to leverage individual PHY
> + * capabilities.
> + */
> +#ifndef __PHY_LINK_TOPOLOGY_H
> +#define __PHY_LINK_TOPOLOGY_H
> +
> +#include <linux/ethtool.h>
> +#include <linux/netdevice.h>
> +
> +struct xarray;
> +struct phy_device;
> +struct sfp_bus;
> +
> +struct phy_link_topology {
> +	struct xarray phys;
> +	u32 next_phy_index;
> +};
> +
> +struct phy_device_node {
> +	enum phy_upstream upstream_type;
> +
> +	union {
> +		struct net_device	*netdev;
> +		struct phy_device	*phydev;
> +	} upstream;
> +
> +	struct sfp_bus *parent_sfp_bus;
> +
> +	struct phy_device *phy;
> +};
> +
> +#if IS_ENABLED(CONFIG_PHYLIB)
> +int phy_link_topo_add_phy(struct net_device *dev,
> +			  struct phy_device *phy,
> +			  enum phy_upstream upt, void *upstream);
> +
> +void phy_link_topo_del_phy(struct net_device *dev, struct phy_device *phy);
> +
> +static inline struct phy_device *
> +phy_link_topo_get_phy(struct net_device *dev, u32 phyindex)
> +{
> +	struct phy_link_topology *topo = dev->link_topo;
> +	struct phy_device_node *pdn;
> +
> +	if (!topo)
> +		return NULL;
> +
> +	pdn = xa_load(&topo->phys, phyindex);
> +	if (pdn)
> +		return pdn->phy;
> +
> +	return NULL;
> +}
> +
> +#else
> +static inline int phy_link_topo_add_phy(struct net_device *dev,
> +					struct phy_device *phy,
> +					enum phy_upstream upt, void *upstream)
> +{
> +	return 0;
> +}
> +
> +static inline void phy_link_topo_del_phy(struct net_device *dev,
> +					 struct phy_device *phy)
> +{
> +}
> +
> +static inline struct phy_device *
> +phy_link_topo_get_phy(struct net_device *dev, u32 phyindex)
> +{
> +	return NULL;
> +}
> +#endif
> +
> +#endif /* __PHY_LINK_TOPOLOGY_H */
> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> index 230110b97029..baf9e4d1855b 100644
> --- a/include/uapi/linux/ethtool.h
> +++ b/include/uapi/linux/ethtool.h
> @@ -2532,4 +2532,20 @@ struct ethtool_link_settings {
>   	 * __u32 map_lp_advertising[link_mode_masks_nwords];
>   	 */
>   };
> +
> +/**
> + * enum phy_upstream - Represents the upstream component a given PHY device
> + * is connected to, as in what is on the other end of the MII bus. Most PHYs
> + * will be attached to an Ethernet MAC controller, but in some cases, there's
> + * an intermediate PHY used as a media-converter, which will driver another
> + * MII interface as its output.
> + * @PHY_UPSTREAM_MAC: Upstream component is a MAC (a switch port,
> + *		      or ethernet controller)
> + * @PHY_UPSTREAM_PHY: Upstream component is a PHY (likely a media converter)
> + */
> +enum phy_upstream {
> +	PHY_UPSTREAM_MAC,
> +	PHY_UPSTREAM_PHY,
> +};
> +
>   #endif /* _UAPI_LINUX_ETHTOOL_H */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 73e5af6943c3..cd316d97c145 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -158,6 +158,7 @@
>   #include <net/page_pool/types.h>
>   #include <net/page_pool/helpers.h>
>   #include <net/rps.h>
> +#include <linux/phy_link_topology.h>
>   
>   #include "dev.h"
>   #include "net-sysfs.h"
> @@ -10312,6 +10313,17 @@ static void netdev_do_free_pcpu_stats(struct net_device *dev)
>   	}
>   }
>   
> +static void netdev_free_phy_link_topology(struct net_device *dev)
> +{
> +	struct phy_link_topology *topo = dev->link_topo;
> +
> +	if (IS_ENABLED(CONFIG_PHYLIB) && topo) {
> +		xa_destroy(&topo->phys);
> +		kfree(topo);
> +		dev->link_topo = NULL;
> +	}
> +}
> +
>   /**
>    * register_netdevice() - register a network device
>    * @dev: device to register
> @@ -11108,6 +11120,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
>   #ifdef CONFIG_NET_SCHED
>   	hash_init(dev->qdisc_hash);
>   #endif
> +
>   	dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
>   	setup(dev);
>   
> @@ -11200,6 +11213,8 @@ void free_netdev(struct net_device *dev)
>   	free_percpu(dev->xdp_bulkq);
>   	dev->xdp_bulkq = NULL;
>   
> +	netdev_free_phy_link_topology(dev);
> +
>   	/*  Compatibility with error handling in drivers */
>   	if (dev->reg_state == NETREG_UNINITIALIZED ||
>   	    dev->reg_state == NETREG_DUMMY) {
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index e0f28278e504..5135c3379234 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8194,6 +8194,7 @@  F:	include/linux/mii.h
 F:	include/linux/of_net.h
 F:	include/linux/phy.h
 F:	include/linux/phy_fixed.h
+F:	include/linux/phy_link_topology.h
 F:	include/linux/phylib_stubs.h
 F:	include/linux/platform_data/mdio-bcm-unimac.h
 F:	include/linux/platform_data/mdio-gpio.h
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 202ed7f450da..1d8be374915f 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -2,7 +2,7 @@ 
 # Makefile for Linux PHY drivers
 
 libphy-y			:= phy.o phy-c45.o phy-core.o phy_device.o \
-				   linkmode.o
+				   linkmode.o phy_link_topology.o
 mdio-bus-y			+= mdio_bus.o mdio_device.o
 
 ifdef CONFIG_MDIO_DEVICE
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 70b07e621fb2..e68acaba1b4f 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -29,6 +29,7 @@ 
 #include <linux/phy.h>
 #include <linux/phylib_stubs.h>
 #include <linux/phy_led_triggers.h>
+#include <linux/phy_link_topology.h>
 #include <linux/pse-pd/pse.h>
 #include <linux/property.h>
 #include <linux/rtnetlink.h>
@@ -1511,6 +1512,10 @@  int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 
 		if (phydev->sfp_bus_attached)
 			dev->sfp_bus = phydev->sfp_bus;
+
+		err = phy_link_topo_add_phy(dev, phydev, PHY_UPSTREAM_MAC, dev);
+		if (err)
+			goto error;
 	}
 
 	/* Some Ethernet drivers try to connect to a PHY device before
@@ -1938,6 +1943,7 @@  void phy_detach(struct phy_device *phydev)
 	if (dev) {
 		phydev->attached_dev->phydev = NULL;
 		phydev->attached_dev = NULL;
+		phy_link_topo_del_phy(dev, phydev);
 	}
 	phydev->phylink = NULL;
 
diff --git a/drivers/net/phy/phy_link_topology.c b/drivers/net/phy/phy_link_topology.c
new file mode 100644
index 000000000000..4a5d73002a1a
--- /dev/null
+++ b/drivers/net/phy/phy_link_topology.c
@@ -0,0 +1,105 @@ 
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Infrastructure to handle all PHY devices connected to a given netdev,
+ * either directly or indirectly attached.
+ *
+ * Copyright (c) 2023 Maxime Chevallier<maxime.chevallier@bootlin.com>
+ */
+
+#include <linux/phy_link_topology.h>
+#include <linux/phy.h>
+#include <linux/rtnetlink.h>
+#include <linux/xarray.h>
+
+static int netdev_alloc_phy_link_topology(struct net_device *dev)
+{
+	struct phy_link_topology *topo;
+
+	topo = kzalloc(sizeof(*topo), GFP_KERNEL);
+	if (!topo)
+		return -ENOMEM;
+
+	xa_init_flags(&topo->phys, XA_FLAGS_ALLOC1);
+	topo->next_phy_index = 1;
+
+	dev->link_topo = topo;
+
+	return 0;
+}
+
+int phy_link_topo_add_phy(struct net_device *dev,
+			  struct phy_device *phy,
+			  enum phy_upstream upt, void *upstream)
+{
+	struct phy_link_topology *topo = dev->link_topo;
+	struct phy_device_node *pdn;
+	int ret;
+
+	if (!topo) {
+		ret = netdev_alloc_phy_link_topology(dev);
+		if (ret)
+			return ret;
+
+		topo = dev->link_topo;
+	}
+
+	pdn = kzalloc(sizeof(*pdn), GFP_KERNEL);
+	if (!pdn)
+		return -ENOMEM;
+
+	pdn->phy = phy;
+	switch (upt) {
+	case PHY_UPSTREAM_MAC:
+		pdn->upstream.netdev = (struct net_device *)upstream;
+		if (phy_on_sfp(phy))
+			pdn->parent_sfp_bus = pdn->upstream.netdev->sfp_bus;
+		break;
+	case PHY_UPSTREAM_PHY:
+		pdn->upstream.phydev = (struct phy_device *)upstream;
+		if (phy_on_sfp(phy))
+			pdn->parent_sfp_bus = pdn->upstream.phydev->sfp_bus;
+		break;
+	default:
+		ret = -EINVAL;
+		goto err;
+	}
+	pdn->upstream_type = upt;
+
+	/* Attempt to re-use a previously allocated phy_index */
+	if (phy->phyindex)
+		ret = xa_insert(&topo->phys, phy->phyindex, pdn, GFP_KERNEL);
+	else
+		ret = xa_alloc_cyclic(&topo->phys, &phy->phyindex, pdn,
+				      xa_limit_32b, &topo->next_phy_index,
+				      GFP_KERNEL);
+
+	if (ret)
+		goto err;
+
+	return 0;
+
+err:
+	kfree(pdn);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(phy_link_topo_add_phy);
+
+void phy_link_topo_del_phy(struct net_device *dev,
+			   struct phy_device *phy)
+{
+	struct phy_link_topology *topo = dev->link_topo;
+	struct phy_device_node *pdn;
+
+	if (!topo)
+		return;
+
+	pdn = xa_erase(&topo->phys, phy->phyindex);
+
+	/* We delete the PHY from the topology, however we don't re-set the
+	 * phy->phyindex field. If the PHY isn't gone, we can re-assign it the
+	 * same index next time it's added back to the topology
+	 */
+
+	kfree(pdn);
+}
+EXPORT_SYMBOL_GPL(phy_link_topo_del_phy);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 93558645c6d0..937da1dfcb2c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -40,7 +40,6 @@ 
 #include <net/dcbnl.h>
 #endif
 #include <net/netprio_cgroup.h>
-
 #include <linux/netdev_features.h>
 #include <linux/neighbour.h>
 #include <linux/netdevice_xmit.h>
@@ -81,6 +80,7 @@  struct xdp_frame;
 struct xdp_metadata_ops;
 struct xdp_md;
 struct ethtool_netdev_state;
+struct phy_link_topology;
 
 typedef u32 xdp_features_t;
 
@@ -1977,6 +1977,7 @@  enum netdev_reg_state {
  *	@fcoe_ddp_xid:	Max exchange id for FCoE LRO by ddp
  *
  *	@priomap:	XXX: need comments on this one
+ *	@link_topo:	Physical link topology tracking attached PHYs
  *	@phydev:	Physical device may attach itself
  *			for hardware timestamping
  *	@sfp_bus:	attached &struct sfp_bus structure.
@@ -2368,6 +2369,7 @@  struct net_device {
 #if IS_ENABLED(CONFIG_CGROUP_NET_PRIO)
 	struct netprio_map __rcu *priomap;
 #endif
+	struct phy_link_topology	*link_topo;
 	struct phy_device	*phydev;
 	struct sfp_bus		*sfp_bus;
 	struct lock_class_key	*qdisc_tx_busylock;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index bd68f9d8e74f..2d477eb2809a 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -554,6 +554,9 @@  struct macsec_ops;
  * @drv: Pointer to the driver for this PHY instance
  * @devlink: Create a link between phy dev and mac dev, if the external phy
  *           used by current mac interface is managed by another mac interface.
+ * @phyindex: Unique id across the phy's parent tree of phys to address the PHY
+ *	      from userspace, similar to ifindex. A zero index means the PHY
+ *	      wasn't assigned an id yet.
  * @phy_id: UID for this device found during discovery
  * @c45_ids: 802.3-c45 Device Identifiers if is_c45.
  * @is_c45:  Set to true if this PHY uses clause 45 addressing.
@@ -654,6 +657,7 @@  struct phy_device {
 
 	struct device_link *devlink;
 
+	u32 phyindex;
 	u32 phy_id;
 
 	struct phy_c45_device_ids c45_ids;
diff --git a/include/linux/phy_link_topology.h b/include/linux/phy_link_topology.h
new file mode 100644
index 000000000000..68a59e25821c
--- /dev/null
+++ b/include/linux/phy_link_topology.h
@@ -0,0 +1,82 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * PHY device list allow maintaining a list of PHY devices that are
+ * part of a netdevice's link topology. PHYs can for example be chained,
+ * as is the case when using a PHY that exposes an SFP module, on which an
+ * SFP transceiver that embeds a PHY is connected.
+ *
+ * This list can then be used by userspace to leverage individual PHY
+ * capabilities.
+ */
+#ifndef __PHY_LINK_TOPOLOGY_H
+#define __PHY_LINK_TOPOLOGY_H
+
+#include <linux/ethtool.h>
+#include <linux/netdevice.h>
+
+struct xarray;
+struct phy_device;
+struct sfp_bus;
+
+struct phy_link_topology {
+	struct xarray phys;
+	u32 next_phy_index;
+};
+
+struct phy_device_node {
+	enum phy_upstream upstream_type;
+
+	union {
+		struct net_device	*netdev;
+		struct phy_device	*phydev;
+	} upstream;
+
+	struct sfp_bus *parent_sfp_bus;
+
+	struct phy_device *phy;
+};
+
+#if IS_ENABLED(CONFIG_PHYLIB)
+int phy_link_topo_add_phy(struct net_device *dev,
+			  struct phy_device *phy,
+			  enum phy_upstream upt, void *upstream);
+
+void phy_link_topo_del_phy(struct net_device *dev, struct phy_device *phy);
+
+static inline struct phy_device *
+phy_link_topo_get_phy(struct net_device *dev, u32 phyindex)
+{
+	struct phy_link_topology *topo = dev->link_topo;
+	struct phy_device_node *pdn;
+
+	if (!topo)
+		return NULL;
+
+	pdn = xa_load(&topo->phys, phyindex);
+	if (pdn)
+		return pdn->phy;
+
+	return NULL;
+}
+
+#else
+static inline int phy_link_topo_add_phy(struct net_device *dev,
+					struct phy_device *phy,
+					enum phy_upstream upt, void *upstream)
+{
+	return 0;
+}
+
+static inline void phy_link_topo_del_phy(struct net_device *dev,
+					 struct phy_device *phy)
+{
+}
+
+static inline struct phy_device *
+phy_link_topo_get_phy(struct net_device *dev, u32 phyindex)
+{
+	return NULL;
+}
+#endif
+
+#endif /* __PHY_LINK_TOPOLOGY_H */
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 230110b97029..baf9e4d1855b 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -2532,4 +2532,20 @@  struct ethtool_link_settings {
 	 * __u32 map_lp_advertising[link_mode_masks_nwords];
 	 */
 };
+
+/**
+ * enum phy_upstream - Represents the upstream component a given PHY device
+ * is connected to, as in what is on the other end of the MII bus. Most PHYs
+ * will be attached to an Ethernet MAC controller, but in some cases, there's
+ * an intermediate PHY used as a media-converter, which will driver another
+ * MII interface as its output.
+ * @PHY_UPSTREAM_MAC: Upstream component is a MAC (a switch port,
+ *		      or ethernet controller)
+ * @PHY_UPSTREAM_PHY: Upstream component is a PHY (likely a media converter)
+ */
+enum phy_upstream {
+	PHY_UPSTREAM_MAC,
+	PHY_UPSTREAM_PHY,
+};
+
 #endif /* _UAPI_LINUX_ETHTOOL_H */
diff --git a/net/core/dev.c b/net/core/dev.c
index 73e5af6943c3..cd316d97c145 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -158,6 +158,7 @@ 
 #include <net/page_pool/types.h>
 #include <net/page_pool/helpers.h>
 #include <net/rps.h>
+#include <linux/phy_link_topology.h>
 
 #include "dev.h"
 #include "net-sysfs.h"
@@ -10312,6 +10313,17 @@  static void netdev_do_free_pcpu_stats(struct net_device *dev)
 	}
 }
 
+static void netdev_free_phy_link_topology(struct net_device *dev)
+{
+	struct phy_link_topology *topo = dev->link_topo;
+
+	if (IS_ENABLED(CONFIG_PHYLIB) && topo) {
+		xa_destroy(&topo->phys);
+		kfree(topo);
+		dev->link_topo = NULL;
+	}
+}
+
 /**
  * register_netdevice() - register a network device
  * @dev: device to register
@@ -11108,6 +11120,7 @@  struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 #ifdef CONFIG_NET_SCHED
 	hash_init(dev->qdisc_hash);
 #endif
+
 	dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
 	setup(dev);
 
@@ -11200,6 +11213,8 @@  void free_netdev(struct net_device *dev)
 	free_percpu(dev->xdp_bulkq);
 	dev->xdp_bulkq = NULL;
 
+	netdev_free_phy_link_topology(dev);
+
 	/*  Compatibility with error handling in drivers */
 	if (dev->reg_state == NETREG_UNINITIALIZED ||
 	    dev->reg_state == NETREG_DUMMY) {