diff mbox series

[RFC,V1,net-next,3/4] net: Let the active time stamping layer be selectable.

Message ID 20220103232555.19791-4-richardcochran@gmail.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series Make MAC/PHY time stamping selectable | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 5 this patch: 6
netdev/cc_maintainers warning 3 maintainers not CCed: danieller@nvidia.com arnd@arndb.de atenart@kernel.org
netdev/build_clang fail Errors and warnings before: 24 this patch: 27
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 7 this patch: 8
netdev/checkpatch warning CHECK: Blank lines aren't necessary after an open brace '{' WARNING: braces {} are not necessary for single statement blocks WARNING: line length of 81 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0

Commit Message

Richard Cochran Jan. 3, 2022, 11:25 p.m. UTC
Make the sysfs knob writable, and add checks in the ioctl and time
stamping paths to respect the currently selected time stamping layer.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 .../ABI/testing/sysfs-class-net-timestamping  |  5 +-
 net/core/dev_ioctl.c                          | 44 ++++++++++++++--
 net/core/net-sysfs.c                          | 50 +++++++++++++++++--
 net/core/timestamping.c                       |  6 +++
 net/ethtool/common.c                          | 18 +++++--
 5 files changed, 111 insertions(+), 12 deletions(-)

Comments

Russell King (Oracle) Jan. 3, 2022, 11:53 p.m. UTC | #1
On Mon, Jan 03, 2022 at 03:25:54PM -0800, Richard Cochran wrote:
> Make the sysfs knob writable, and add checks in the ioctl and time
> stamping paths to respect the currently selected time stamping layer.
> 
> Signed-off-by: Richard Cochran <richardcochran@gmail.com>

As I stated for patch 2, this patch will break mvpp2 PTP support,
since as soon as we bind a PHY, whether or not it supports PTP, you
will switch "selected_timestamping_layer" to be PHY mode PTP, and
direct all PTP calls to the PHY layer whether or not the PHY has
PTP support, away from the MAC layer.
Richard Cochran Jan. 4, 2022, 1:42 a.m. UTC | #2
On Mon, Jan 03, 2022 at 11:53:57PM +0000, Russell King (Oracle) wrote:
> On Mon, Jan 03, 2022 at 03:25:54PM -0800, Richard Cochran wrote:
> > Make the sysfs knob writable, and add checks in the ioctl and time
> > stamping paths to respect the currently selected time stamping layer.
> > 
> > Signed-off-by: Richard Cochran <richardcochran@gmail.com>
> 
> As I stated for patch 2, this patch will break mvpp2 PTP support,
> since as soon as we bind a PHY, whether or not it supports PTP, you
> will switch "selected_timestamping_layer" to be PHY mode PTP, and
> direct all PTP calls to the PHY layer whether or not the PHY has
> PTP support, away from the MAC layer.

Oh, that was a brain fart of mine.

I'll amend that to switch selected_timestamping_layer only if the PHY
does support time stamping.

IMO, the default should be PHY because up until now the PHY layer was
prefered.

Or would you say the MAC layer should take default priority?

(that may well break some existing systems)

Thanks,
Richard
Vladimir Oltean Jan. 20, 2022, 4:48 p.m. UTC | #3
Hi Richard,

On Mon, Jan 03, 2022 at 03:25:54PM -0800, Richard Cochran wrote:
> Make the sysfs knob writable, and add checks in the ioctl and time
> stamping paths to respect the currently selected time stamping layer.
> 
> Signed-off-by: Richard Cochran <richardcochran@gmail.com>
> ---

Could we think of a more flexible solution? Your proposal would not
allow a packet to have multiple hwtstamps, and maybe that would be
interesting for some use cases (hardware testing, mostly).
The problem with that, really, is that user space doesn't know which PHC
is a hardware timestamp coming from - this info isn't present in the
struct scm_timestamping presented in the SOL_SOCKET/SCM_TIMESTAMPING
cmsg.

This is also the reason why DSA denies PTP timestamping on the master
interface, although there isn't any physical reason to do that. For the
same reason mentioned earlier, it would be nice to see hwtstamps for a
packet as it traverses DSA master -> DSA switch port -> PHY attached to
DSA switch.

With a new SO_TIMESTAMPING API (say, SOF_TIMESTAMPING_SELECT_PHC |
SOF_TIMESTAMPING_PHY, plus a new SCM_TIMESTAMP_PHC enum that would also
contain the PHC index), we could make this a per-socket option. Kernel
keeps track of whether any socket requests PHY timestamping, and
enables/disables PHY timestamping as needed. Your patch replays a call
to ops->ndo_eth_ioctl() from current_timestamping_provider_store()
anyway (I mean creates a call from outside its intended calling context),
we'd need similar logic to call that function from sock_set_timestamping()
or thereabouts.

Do you consider this a valid use case? Different approaches for
different needs, I suppose. I guess what you want to achieve is for
ptp4l to not really care who is the timestamp provider.

>  .../ABI/testing/sysfs-class-net-timestamping  |  5 +-
>  net/core/dev_ioctl.c                          | 44 ++++++++++++++--
>  net/core/net-sysfs.c                          | 50 +++++++++++++++++--
>  net/core/timestamping.c                       |  6 +++
>  net/ethtool/common.c                          | 18 +++++--
>  5 files changed, 111 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-class-net-timestamping b/Documentation/ABI/testing/sysfs-class-net-timestamping
> index 529c3a6eb607..6dfd59740cad 100644
> --- a/Documentation/ABI/testing/sysfs-class-net-timestamping
> +++ b/Documentation/ABI/testing/sysfs-class-net-timestamping
> @@ -11,7 +11,10 @@ What:		/sys/class/net/<iface>/current_timestamping_provider
>  Date:		January 2022
>  Contact:	Richard Cochran <richardcochran@gmail.com>
>  Description:
> -		Show the current SO_TIMESTAMPING provider.
> +		Shows or sets the current SO_TIMESTAMPING provider.
> +		When changing the value, some packets in the kernel
> +		networking stack may still be delivered with time
> +		stamps from the previous provider.
>  		The possible values are:
>  		- "mac"  The MAC provides time stamping.
>  		- "phy"  The PHY or MII device provides time stamping.
> diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
> index 1b807d119da5..269068ce3a51 100644
> --- a/net/core/dev_ioctl.c
> +++ b/net/core/dev_ioctl.c
> @@ -260,6 +260,43 @@ static int dev_eth_ioctl(struct net_device *dev,
>  	return err;
>  }
>  
> +static int dev_hwtstamp_ioctl(struct net_device *dev,
> +			      struct ifreq *ifr, unsigned int cmd)
> +{
> +	const struct net_device_ops *ops = dev->netdev_ops;
> +	int err;
> +
> +	err = dsa_ndo_eth_ioctl(dev, ifr, cmd);
> +	if (err == 0 || err != -EOPNOTSUPP)
> +		return err;
> +
> +	if (!netif_device_present(dev))
> +		return -ENODEV;
> +
> +	switch (dev->selected_timestamping_layer) {
> +
> +	case MAC_TIMESTAMPING:
> +		if (ops->ndo_do_ioctl == phy_do_ioctl) {
> +			/* Some drivers set .ndo_do_ioctl to phy_do_ioctl. */
> +			err = -EOPNOTSUPP;
> +		} else {
> +			err = ops->ndo_eth_ioctl(dev, ifr, cmd);
> +		}
> +		break;
> +
> +	case PHY_TIMESTAMPING:
> +		if (phy_has_hwtstamp(dev->phydev)) {
> +			err = phy_mii_ioctl(dev->phydev, ifr, cmd);
> +		} else {
> +			err = -ENODEV;
> +			WARN_ON(1);
> +		}
> +		break;
> +	}
> +
> +	return err;
> +}
> +
>  static int dev_siocbond(struct net_device *dev,
>  			struct ifreq *ifr, unsigned int cmd)
>  {
> @@ -395,6 +432,9 @@ static int dev_ifsioc(struct net *net, struct ifreq *ifr, void __user *data,
>  			return err;
>  		fallthrough;
>  
> +	case SIOCGHWTSTAMP:
> +		return dev_hwtstamp_ioctl(dev, ifr, cmd);
> +
>  	/*
>  	 *	Unknown or private ioctl
>  	 */
> @@ -405,9 +445,7 @@ static int dev_ifsioc(struct net *net, struct ifreq *ifr, void __user *data,
>  
>  		if (cmd == SIOCGMIIPHY ||
>  		    cmd == SIOCGMIIREG ||
> -		    cmd == SIOCSMIIREG ||
> -		    cmd == SIOCSHWTSTAMP ||
> -		    cmd == SIOCGHWTSTAMP) {
> +		    cmd == SIOCSMIIREG) {
>  			err = dev_eth_ioctl(dev, ifr, cmd);
>  		} else if (cmd == SIOCBONDENSLAVE ||
>  		    cmd == SIOCBONDRELEASE ||
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 4ff7ef417c38..c27f01a1a285 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -664,17 +664,59 @@ static ssize_t current_timestamping_provider_show(struct device *dev,
>  	if (!rtnl_trylock())
>  		return restart_syscall();
>  
> -	if (phy_has_tsinfo(phydev)) {
> -		ret = sprintf(buf, "%s\n", "phy");
> -	} else {
> +	switch (netdev->selected_timestamping_layer) {
> +	case MAC_TIMESTAMPING:
>  		ret = sprintf(buf, "%s\n", "mac");
> +		break;
> +	case PHY_TIMESTAMPING:
> +		ret = sprintf(buf, "%s\n", "phy");
> +		break;
>  	}
>  
>  	rtnl_unlock();
>  
>  	return ret;
>  }
> -static DEVICE_ATTR_RO(current_timestamping_provider);
> +
> +static ssize_t current_timestamping_provider_store(struct device *dev,
> +						   struct device_attribute *attr,
> +						   const char *buf, size_t len)
> +{
> +	struct net_device *netdev = to_net_dev(dev);
> +	struct net *net = dev_net(netdev);
> +	enum timestamping_layer flavor;
> +
> +	if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
> +		return -EPERM;
> +
> +	if (!strcmp(buf, "mac\n"))
> +		flavor = MAC_TIMESTAMPING;
> +	else if (!strcmp(buf, "phy\n"))
> +		flavor = PHY_TIMESTAMPING;

Shouldn't we use sysfs_streq()?

> +	else
> +		return -EINVAL;
> +
> +	if (!rtnl_trylock())
> +		return restart_syscall();
> +
> +	if (!dev_isalive(netdev))
> +		goto out;
> +
> +	if (netdev->selected_timestamping_layer != flavor) {
> +		const struct net_device_ops *ops = netdev->netdev_ops;
> +		struct ifreq ifr = {0};
> +
> +		/* Disable time stamping in the current layer. */
> +		if (netif_device_present(netdev) && ops->ndo_eth_ioctl)
> +			ops->ndo_eth_ioctl(netdev, &ifr, SIOCSHWTSTAMP);
> +
> +		netdev->selected_timestamping_layer = flavor;

I'm unclear about this. If MAC timestamping was previously enabled on
the interface, ptp4l is running, and the admin surprise-changes it to
PHY, this will not enable PHY timestamping, will it? So the ptp4l
application must be restarted.

If I'm correct about this, then it would be cleaner to use the
setsockopt interface, since the kernel would have a better way of not
changing stuff from under existing processes' feet.

> +	}
> +out:
> +	rtnl_unlock();
> +	return len;
> +}
> +static DEVICE_ATTR_RW(current_timestamping_provider);
>  
>  static struct attribute *net_class_attrs[] __ro_after_init = {
>  	&dev_attr_netdev_group.attr,
> diff --git a/net/core/timestamping.c b/net/core/timestamping.c
> index 04840697fe79..31c3142787b7 100644
> --- a/net/core/timestamping.c
> +++ b/net/core/timestamping.c
> @@ -28,6 +28,9 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
>  	if (!skb->sk)
>  		return;
>  
> +	if (skb->dev->selected_timestamping_layer != PHY_TIMESTAMPING)
> +		return;
> +
>  	type = classify(skb);
>  	if (type == PTP_CLASS_NONE)
>  		return;
> @@ -50,6 +53,9 @@ bool skb_defer_rx_timestamp(struct sk_buff *skb)
>  	if (!skb->dev || !skb->dev->phydev || !skb->dev->phydev->mii_ts)
>  		return false;
>  
> +	if (skb->dev->selected_timestamping_layer != PHY_TIMESTAMPING)
> +		return false;
> +
>  	if (skb_headroom(skb) < ETH_HLEN)
>  		return false;
>  
> diff --git a/net/ethtool/common.c b/net/ethtool/common.c
> index 651d18eef589..7b50820c1d1d 100644
> --- a/net/ethtool/common.c
> +++ b/net/ethtool/common.c
> @@ -545,10 +545,20 @@ int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
>  	memset(info, 0, sizeof(*info));
>  	info->cmd = ETHTOOL_GET_TS_INFO;
>  
> -	if (phy_has_tsinfo(phydev))
> -		return phy_ts_info(phydev, info);
> -	if (ops->get_ts_info)
> -		return ops->get_ts_info(dev, info);
> +	switch (dev->selected_timestamping_layer) {
> +
> +	case MAC_TIMESTAMPING:
> +		if (ops->get_ts_info)
> +			return ops->get_ts_info(dev, info);
> +		break;
> +
> +	case PHY_TIMESTAMPING:
> +		if (phy_has_tsinfo(phydev)) {
> +			return phy_ts_info(phydev, info);
> +		}
> +		WARN_ON(1);
> +		return -ENODEV;
> +	}
>  
>  	info->so_timestamping = SOF_TIMESTAMPING_RX_SOFTWARE |
>  				SOF_TIMESTAMPING_SOFTWARE;
> -- 
> 2.20.1
>
Andrew Lunn Jan. 21, 2022, 3:38 a.m. UTC | #4
> This is also the reason why DSA denies PTP timestamping on the master
> interface, although there isn't any physical reason to do that. For the
> same reason mentioned earlier, it would be nice to see hwtstamps for a
> packet as it traverses DSA master -> DSA switch port -> PHY attached to
> DSA switch.

Don't forget there could be back to back PHYs between the master and
the DSA switch port. In theory they could also be doing time stamping.

Also consider the case of a switch port connected to a PHY which does
media conversion to SFP. And the SFP has a copper module, so contains
another PHY. So you could have the MAC and both PHYs doing time
stamping?

So in the extreme case, you have 7 time stamps, 3 from MACs and 4 from
PHYs!

I doubt we want to support this, is there a valid use case for it?

  Andrew
Richard Cochran Jan. 21, 2022, 4:05 a.m. UTC | #5
On Fri, Jan 21, 2022 at 04:38:09AM +0100, Andrew Lunn wrote:

> So in the extreme case, you have 7 time stamps, 3 from MACs and 4 from
> PHYs!

:^)
 
> I doubt we want to support this, is there a valid use case for it?

Someday, someone will surely say it is important, but with any luck
I'll be dead by then...

Cheers,
Richard
Kurt Kanzenbach Jan. 21, 2022, 11:05 a.m. UTC | #6
On Thu Jan 20 2022, Vladimir Oltean wrote:
> Hi Richard,
>
> On Mon, Jan 03, 2022 at 03:25:54PM -0800, Richard Cochran wrote:
>> Make the sysfs knob writable, and add checks in the ioctl and time
>> stamping paths to respect the currently selected time stamping layer.
>> 
>> Signed-off-by: Richard Cochran <richardcochran@gmail.com>
>> ---
>
> Could we think of a more flexible solution? Your proposal would not
> allow a packet to have multiple hwtstamps, and maybe that would be
> interesting for some use cases (hardware testing, mostly).

One use case i can think of for having multiple hwtimestamps per packet
is crosstimestamping. Some devices such as hellcreek have multiple PHCs
and allow generation of such crosstimestamps.

Thanks,
Kurt
Vladimir Oltean Jan. 21, 2022, 2:50 p.m. UTC | #7
On Thu, Jan 20, 2022 at 08:05:08PM -0800, Richard Cochran wrote:
> On Fri, Jan 21, 2022 at 04:38:09AM +0100, Andrew Lunn wrote:
> 
> > So in the extreme case, you have 7 time stamps, 3 from MACs and 4 from
> > PHYs!
> 
> :^)
>  
> > I doubt we want to support this, is there a valid use case for it?
> 
> Someday, someone will surely say it is important, but with any luck
> I'll be dead by then...

So as I mentioned earlier, the use case would be hardware performance
testing and diagnosing. You may consider that as not that important, but
this is basically what I had to do for several months, and even wrote
a program for that, that collects packet timestamps at all possible points.
And to your point, the more complex a system is, the more important it
becomes to be able to diagnose it precisely.
Richard Cochran Jan. 21, 2022, 3:28 p.m. UTC | #8
On Fri, Jan 21, 2022 at 02:50:36PM +0000, Vladimir Oltean wrote:
> So as I mentioned earlier, the use case would be hardware performance
> testing and diagnosing. You may consider that as not that important, but
> this is basically what I had to do for several months, and even wrote
> a program for that, that collects packet timestamps at all possible points.

This is not possible without making a brand new CMSG to accommodate
time stamps from all the various layers.

That is completely out of scope for this series.

The only practical use case of this series is to switch from PHY back to MAC.

Thanks,
Richard
Richard Cochran Jan. 21, 2022, 3:31 p.m. UTC | #9
On Fri, Jan 21, 2022 at 12:05:22PM +0100, Kurt Kanzenbach wrote:
> On Thu Jan 20 2022, Vladimir Oltean wrote:
> > Hi Richard,
> >
> > On Mon, Jan 03, 2022 at 03:25:54PM -0800, Richard Cochran wrote:
> >> Make the sysfs knob writable, and add checks in the ioctl and time
> >> stamping paths to respect the currently selected time stamping layer.
> >> 
> >> Signed-off-by: Richard Cochran <richardcochran@gmail.com>
> >> ---
> >
> > Could we think of a more flexible solution? Your proposal would not
> > allow a packet to have multiple hwtstamps, and maybe that would be
> > interesting for some use cases (hardware testing, mostly).
> 
> One use case i can think of for having multiple hwtimestamps per packet
> is crosstimestamping. Some devices such as hellcreek have multiple PHCs
> and allow generation of such crosstimestamps.

That may well be nice to have, but that is not in scope for this series.

Thanks,
Richard
Vladimir Oltean Jan. 21, 2022, 4:23 p.m. UTC | #10
On Fri, Jan 21, 2022 at 07:28:20AM -0800, Richard Cochran wrote:
> On Fri, Jan 21, 2022 at 02:50:36PM +0000, Vladimir Oltean wrote:
> > So as I mentioned earlier, the use case would be hardware performance
> > testing and diagnosing. You may consider that as not that important, but
> > this is basically what I had to do for several months, and even wrote
> > a program for that, that collects packet timestamps at all possible points.
> 
> This is not possible without making a brand new CMSG to accommodate
> time stamps from all the various layers.
> 
> That is completely out of scope for this series.
> 
> The only practical use case of this series is to switch from PHY back to MAC.

I don't think my proposal is out of scope. It deals with the same thing
as what you propose: the kernel makes a selection by default, user space
can change it. The need for PHC identification in the cmsg arises as a
direct consequence of the fact that there are multiple PHCs in the path.
It isn't a requirement that I introduced artificially. Your solution has
a need to deal with that problem too, it's just that it omits to do so:

|               When changing the value, some packets in the kernel
|               networking stack may still be delivered with time
|               stamps from the previous provider.
Richard Cochran Jan. 22, 2022, 2:08 a.m. UTC | #11
On Fri, Jan 21, 2022 at 04:23:27PM +0000, Vladimir Oltean wrote:
> On Fri, Jan 21, 2022 at 07:28:20AM -0800, Richard Cochran wrote:
> > On Fri, Jan 21, 2022 at 02:50:36PM +0000, Vladimir Oltean wrote:
> > > So as I mentioned earlier, the use case would be hardware performance
> > > testing and diagnosing. You may consider that as not that important, but
> > > this is basically what I had to do for several months, and even wrote
> > > a program for that, that collects packet timestamps at all possible points.
> > 
> > This is not possible without making a brand new CMSG to accommodate
> > time stamps from all the various layers.
> > 
> > That is completely out of scope for this series.
> > 
> > The only practical use case of this series is to switch from PHY back to MAC.
> 
> I don't think my proposal is out of scope.

I don't see a way to implement your proposal at all.

If you want to implement your proposal and submit a patch series, I
will gladly drop mine in favor of yours.  Otherwise, my series solves
a real world problem and improves the stack at least a little bit.

Thanks,
Richard
Miroslav Lichvar Jan. 24, 2022, 9:28 a.m. UTC | #12
On Fri, Jan 21, 2022 at 07:28:20AM -0800, Richard Cochran wrote:
> On Fri, Jan 21, 2022 at 02:50:36PM +0000, Vladimir Oltean wrote:
> > So as I mentioned earlier, the use case would be hardware performance
> > testing and diagnosing. You may consider that as not that important, but
> > this is basically what I had to do for several months, and even wrote
> > a program for that, that collects packet timestamps at all possible points.
> 
> This is not possible without making a brand new CMSG to accommodate
> time stamps from all the various layers.

FWIW, scm_timestamping has three fields and the middle one no longer
seems to be used. If a new socket/timestamping option enabled all
three (SW, MAC, PHY) timestamps in the cmsg, I think that would be a
nice feature.

There are applications that receive both SW and HW timestamps in order
to fall back to SW when a HW timestamp glitched or is missing. This
could be extended to three levels with MAC and PHY timestamps.

> That is completely out of scope for this series.
> 
> The only practical use case of this series is to switch from PHY back to MAC.

From an admin point of view, it makes sense to me to have an option to
disable PHY timestamps for the whole device if there are issues with
it. For debugging and applications, it would be nice to have an option
to get all of them at the same time.
Richard Cochran Jan. 24, 2022, 3:37 p.m. UTC | #13
On Mon, Jan 24, 2022 at 10:28:23AM +0100, Miroslav Lichvar wrote:

> FWIW, scm_timestamping has three fields and the middle one no longer
> seems to be used. If a new socket/timestamping option enabled all
> three (SW, MAC, PHY) timestamps in the cmsg, I think that would be a
> nice feature.

This won't work because:

- There would need to be seven^W eight, not three slots.

- Even with just three, the CMSG would have to have a bit that clearly
  identifies the new format.
 
> From an admin point of view, it makes sense to me to have an option to
> disable PHY timestamps for the whole device if there are issues with
> it. For debugging and applications, it would be nice to have an option
> to get all of them at the same time.

Right.  Those are two different use cases.  The present series
addresses the first one.  The second one entails making a new flavor
of time stamping API.

Thanks,
Richard
Vladimir Oltean Jan. 25, 2022, 3:37 p.m. UTC | #14
On Mon, Jan 24, 2022 at 07:37:16AM -0800, Richard Cochran wrote:
> On Mon, Jan 24, 2022 at 10:28:23AM +0100, Miroslav Lichvar wrote:
> 
> > FWIW, scm_timestamping has three fields and the middle one no longer
> > seems to be used. If a new socket/timestamping option enabled all
> > three (SW, MAC, PHY) timestamps in the cmsg, I think that would be a
> > nice feature.
> 
> This won't work because:
> 
> - There would need to be seven^W eight, not three slots.
> 
> - Even with just three, the CMSG would have to have a bit that clearly
>   identifies the new format.
>  
> > From an admin point of view, it makes sense to me to have an option to
> > disable PHY timestamps for the whole device if there are issues with
> > it. For debugging and applications, it would be nice to have an option
> > to get all of them at the same time.
> 
> Right.  Those are two different use cases.  The present series
> addresses the first one.  The second one entails making a new flavor
> of time stamping API.

I agree that they are different use cases, but with the runtime PHC
change being now possible, there isn't really a clear-cut moment when an
application can now say "from this moment on, I'm only getting
timestamps from the new PHC". Especially when you switch over from the
PHY to the MAC, a few tens of ms may pass until you get an RX timestamp
from the PHY, and the MAC PHC may well be up and operational in the
meantime, and ptp4l synchronizing based on its timestamps.

I don't have a clear idea how to put it in practice, either. I'll think
about it, but the problem I see is that the phc_index reported by
"ethtool -T" becomes a fuzzy concept in the presence of multiple PHCs.
Adding the ability to retrieve timestamps from all of them doesn't make
it any easier to select which one gets reported to ethtool.
Maybe there's room for both the sysfs and the socket option after all.
Michael Walle April 4, 2022, 3:05 p.m. UTC | #15
Sorry for digging out this older thread, but it seems to be discussed
in [1].

> IMO, the default should be PHY because up until now the PHY layer was
> prefered.
> 
> Or would you say the MAC layer should take default priority?
> 
> (that may well break some existing systems)

Correct me if I'm wrong, but for systems with multiple interfaces,
in particular switches, you'd need external circuits to synchronize
the PHCs within in the PHYs. (And if you use a time aware scheduler
you'd need to synchronize the MAC, too). Whereas for switches there
is usually just one PHC in the MAC which just works.

On these systems, pushing the timestamping to the PHY would mean
that this external circuitry must exist and have to be in use/
supported. MAC timestamping will work in all cases without any
external dependencies.

I'm working on a board with the LAN9668 switch which has one LAN8814
PHY and two GPY215 PHYs and two internal PHYs. The LAN9668 driver
will forward all timestamping ioctls to the PHY if it supports
timestamping (unconditionally). As soon as the patches to add ptp
support to the LAN8814 will be accepted, I guess it will break the
PTP/TAS support because there is no synchronization between all the
PHCs on that board. Thus, IMHO MAC timestamping should be the default.

-michael

[1] https://lore.kernel.org/netdev/20220308145405.GD29063@hoboy.vegasvil.org/
Andrew Lunn April 4, 2022, 3:20 p.m. UTC | #16
On Mon, Apr 04, 2022 at 05:05:08PM +0200, Michael Walle wrote:
> Sorry for digging out this older thread, but it seems to be discussed
> in [1].
> 
> > IMO, the default should be PHY because up until now the PHY layer was
> > prefered.
> > 
> > Or would you say the MAC layer should take default priority?
> > 
> > (that may well break some existing systems)
> 
> Correct me if I'm wrong, but for systems with multiple interfaces,
> in particular switches, you'd need external circuits to synchronize
> the PHCs within in the PHYs.

If the PHYs are external. There are switches with internal PHYs, so
they might already have the needed synchronisation.

> (And if you use a time aware scheduler
> you'd need to synchronize the MAC, too). Whereas for switches there
> is usually just one PHC in the MAC which just works.

And there could be switches with the MACs being totally
independent. In theory.

> On these systems, pushing the timestamping to the PHY would mean
> that this external circuitry must exist and have to be in use/
> supported. MAC timestamping will work in all cases without any
> external dependencies.

And if the MAC are independent, you need the external dependency.

> I'm working on a board with the LAN9668 switch which has one LAN8814
> PHY and two GPY215 PHYs and two internal PHYs. The LAN9668 driver
> will forward all timestamping ioctls to the PHY if it supports
> timestamping (unconditionally). As soon as the patches to add ptp
> support to the LAN8814 will be accepted, I guess it will break the
> PTP/TAS support because there is no synchronization between all the
> PHCs on that board. Thus, IMHO MAC timestamping should be the default.

There are arguments for MAC being the defaults. But we must have the
option to do different, see above. But the real problem here is
history. It is very hard to change a default without breaking systems
which depend on that default. I believe Russell has a system which
will break if the default is changed.

So i suspect the default cannot be changed, but maybe we need a
mechanism where an interface or a board can express a preference it
would prefer be used when there are multiple choices, in addition to
the user space API to make the selection.

   Andrew
Michael Walle April 4, 2022, 5:12 p.m. UTC | #17
Am 2022-04-04 17:20, schrieb Andrew Lunn:
> On Mon, Apr 04, 2022 at 05:05:08PM +0200, Michael Walle wrote:
>> Sorry for digging out this older thread, but it seems to be discussed
>> in [1].
>> 
>> > IMO, the default should be PHY because up until now the PHY layer was
>> > prefered.
>> >
>> > Or would you say the MAC layer should take default priority?
>> >
>> > (that may well break some existing systems)
>> 
>> Correct me if I'm wrong, but for systems with multiple interfaces,
>> in particular switches, you'd need external circuits to synchronize
>> the PHCs within in the PHYs.
> 
> If the PHYs are external. There are switches with internal PHYs, so
> they might already have the needed synchronisation.
> 
>> (And if you use a time aware scheduler
>> you'd need to synchronize the MAC, too). Whereas for switches there
>> is usually just one PHC in the MAC which just works.
> 
> And there could be switches with the MACs being totally
> independent. In theory.
> 
>> On these systems, pushing the timestamping to the PHY would mean
>> that this external circuitry must exist and have to be in use/
>> supported. MAC timestamping will work in all cases without any
>> external dependencies.
> 
> And if the MAC are independent, you need the external dependency.
> 
>> I'm working on a board with the LAN9668 switch which has one LAN8814
>> PHY and two GPY215 PHYs and two internal PHYs. The LAN9668 driver
>> will forward all timestamping ioctls to the PHY if it supports
>> timestamping (unconditionally). As soon as the patches to add ptp
>> support to the LAN8814 will be accepted, I guess it will break the
>> PTP/TAS support because there is no synchronization between all the
>> PHCs on that board. Thus, IMHO MAC timestamping should be the default.
> 
> There are arguments for MAC being the defaults. But we must have the
> option to do different, see above. But the real problem here is
> history. It is very hard to change a default without breaking systems
> which depend on that default. I believe Russell has a system which
> will break if the default is changed.
> 
> So i suspect the default cannot be changed, but maybe we need a
> mechanism where an interface or a board can express a preference it
> would prefer be used when there are multiple choices, in addition to
> the user space API to make the selection.

That would make sense. I guess what bothers me with the current
mechanism is that a feature addition to the PHY in the *future* (the
timestamping support) might break a board - or at least changes the
behavior by suddenly using PHY timestamping.

-michael
Richard Cochran April 5, 2022, 5:59 a.m. UTC | #18
On Mon, Apr 04, 2022 at 07:12:11PM +0200, Michael Walle wrote:

> That would make sense. I guess what bothers me with the current
> mechanism is that a feature addition to the PHY in the *future* (the
> timestamping support) might break a board - or at least changes the
> behavior by suddenly using PHY timestamping.

That is a good point, but then something will break in any case.

Thanks,
Richard
Michael Walle April 5, 2022, 8:48 a.m. UTC | #19
Am 2022-04-05 07:59, schrieb Richard Cochran:
> On Mon, Apr 04, 2022 at 07:12:11PM +0200, Michael Walle wrote:
> 
>> That would make sense. I guess what bothers me with the current
>> mechanism is that a feature addition to the PHY in the *future* (the
>> timestamping support) might break a board - or at least changes the
>> behavior by suddenly using PHY timestamping.
> 
> That is a good point, but then something will break in any case.

Can the ethernet driver select the default one? So any current
driver which has "if phy_has_hwtstamp() forward_ioctl;" can set
it to PHY while newer drivers can (or should actually) leave it as
MAC.

This doesn't really fix the problem per se. But at least new
drivers won't be affected. For my problem at hand, I'd need to
convince Microchip to default to MAC although the driver is
already in the tree (but there is no user of it in mainline
right now).

-michael
Kurt Kanzenbach April 5, 2022, 9:01 a.m. UTC | #20
On Mon Apr 04 2022, Michael Walle wrote:
> That would make sense. I guess what bothers me with the current
> mechanism is that a feature addition to the PHY in the *future* (the
> timestamping support) might break a board - or at least changes the
> behavior by suddenly using PHY timestamping.

Currently PHY timestamping is hidden behind a configuration option
(NETWORK_PHY_TIMESTAMPING). By disabling this option the default
behavior should stay at MAC timestamping even if additional features
are added on top of the PHY drivers at later stages. Or not?

Thanks,
Kurt
Michael Walle April 5, 2022, 9:19 a.m. UTC | #21
Am 2022-04-05 11:01, schrieb Kurt Kanzenbach:
> On Mon Apr 04 2022, Michael Walle wrote:
>> That would make sense. I guess what bothers me with the current
>> mechanism is that a feature addition to the PHY in the *future* (the
>> timestamping support) might break a board - or at least changes the
>> behavior by suddenly using PHY timestamping.
> 
> Currently PHY timestamping is hidden behind a configuration option
> (NETWORK_PHY_TIMESTAMPING). By disabling this option the default
> behavior should stay at MAC timestamping even if additional features
> are added on top of the PHY drivers at later stages. Or not?

That is correct. But a Kconfig option has several drawbacks:
(1) Doesn't work with boards where I might want PHY timestamping
     on *some* ports, thus I need to enable it and then stumple
     across the same problem.
(2) Doesn't work with generic distro support, which is what is
     ARM pushing right now with their SystemReady stuff (among other
     things also for embeddem system). Despite that, I have two boards
     which are already ready for booting debian out of the box for
     example. While I might convince Debian to enable that option
     (as I see it, that option is there to disable the additional
     overhead) it certainly won't be on a per board basis.
     Actually for keeping the MAC timestamping as is, you'd need to
     convince a distribution to never enable the PHY timestamping
     kconfig option.

So yes, I agree it will work when you have control over your
kconfig options, after all (1) might be more academic. But I'm
really concerned about (2).

-michael
Kurt Kanzenbach April 5, 2022, 11:19 a.m. UTC | #22
On Tue Apr 05 2022, Michael Walle wrote:
> Am 2022-04-05 11:01, schrieb Kurt Kanzenbach:
>> On Mon Apr 04 2022, Michael Walle wrote:
>>> That would make sense. I guess what bothers me with the current
>>> mechanism is that a feature addition to the PHY in the *future* (the
>>> timestamping support) might break a board - or at least changes the
>>> behavior by suddenly using PHY timestamping.
>> 
>> Currently PHY timestamping is hidden behind a configuration option
>> (NETWORK_PHY_TIMESTAMPING). By disabling this option the default
>> behavior should stay at MAC timestamping even if additional features
>> are added on top of the PHY drivers at later stages. Or not?
>
> That is correct. But a Kconfig option has several drawbacks:
> (1) Doesn't work with boards where I might want PHY timestamping
>      on *some* ports, thus I need to enable it and then stumple
>      across the same problem.
> (2) Doesn't work with generic distro support, which is what is
>      ARM pushing right now with their SystemReady stuff (among other
>      things also for embeddem system). Despite that, I have two boards
>      which are already ready for booting debian out of the box for
>      example. While I might convince Debian to enable that option
>      (as I see it, that option is there to disable the additional
>      overhead) it certainly won't be on a per board basis.
>      Actually for keeping the MAC timestamping as is, you'd need to
>      convince a distribution to never enable the PHY timestamping
>      kconfig option.
>
> So yes, I agree it will work when you have control over your
> kconfig options, after all (1) might be more academic. But I'm
> really concerned about (2).

Yes, the limitations described above are exactly one of the reasons to
make the timestamping layer configurable at run time as done by these
patches.

Thanks,
Kurt
Grygorii Strashko April 5, 2022, 1:15 p.m. UTC | #23
On 05/04/2022 14:19, Kurt Kanzenbach wrote:
> On Tue Apr 05 2022, Michael Walle wrote:
>> Am 2022-04-05 11:01, schrieb Kurt Kanzenbach:
>>> On Mon Apr 04 2022, Michael Walle wrote:
>>>> That would make sense. I guess what bothers me with the current
>>>> mechanism is that a feature addition to the PHY in the *future* (the
>>>> timestamping support) might break a board - or at least changes the
>>>> behavior by suddenly using PHY timestamping.
>>>
>>> Currently PHY timestamping is hidden behind a configuration option
>>> (NETWORK_PHY_TIMESTAMPING). By disabling this option the default
>>> behavior should stay at MAC timestamping even if additional features
>>> are added on top of the PHY drivers at later stages. Or not?
>>
>> That is correct. But a Kconfig option has several drawbacks:
>> (1) Doesn't work with boards where I might want PHY timestamping
>>       on *some* ports, thus I need to enable it and then stumple
>>       across the same problem.
>> (2) Doesn't work with generic distro support, which is what is
>>       ARM pushing right now with their SystemReady stuff (among other
>>       things also for embeddem system). Despite that, I have two boards
>>       which are already ready for booting debian out of the box for
>>       example. While I might convince Debian to enable that option
>>       (as I see it, that option is there to disable the additional
>>       overhead) it certainly won't be on a per board basis.
>>       Actually for keeping the MAC timestamping as is, you'd need to
>>       convince a distribution to never enable the PHY timestamping
>>       kconfig option.
>>
>> So yes, I agree it will work when you have control over your
>> kconfig options, after all (1) might be more academic. But I'm
>> really concerned about (2).
> 
> Yes, the limitations described above are exactly one of the reasons to
> make the timestamping layer configurable at run time as done by these
> patches.

Seems like PHY TS support belongs to HW description category, so could it be device tree material,
like generic property defining which layer should do timestamping?
Andrew Lunn April 5, 2022, 1:29 p.m. UTC | #24
> > Yes, the limitations described above are exactly one of the reasons to
> > make the timestamping layer configurable at run time as done by these
> > patches.
> 
> Seems like PHY TS support belongs to HW description category, so could it be device tree material,
> like generic property defining which layer should do timestamping?

Maybe. Device tree is supposed to describe the hardware, not how you
configure the hardware. Which PTP you using is a configuration choice,
so i expect some people will argue it should not be in DT.

   Andrew
Richard Cochran April 5, 2022, 3:46 p.m. UTC | #25
On Tue, Apr 05, 2022 at 10:48:21AM +0200, Michael Walle wrote:
> Can the ethernet driver select the default one? So any current
> driver which has "if phy_has_hwtstamp() forward_ioctl;" can set
> it to PHY while newer drivers can (or should actually) leave it as
> MAC.

I want to remove all of the phy_has_ stuff from MAC drivers.
MAC/PHY drivers should just implement the in-kernel interfaces.
IMO that is the clean way forward.

Thanks,
Richard
Richard Cochran April 5, 2022, 3:48 p.m. UTC | #26
On Tue, Apr 05, 2022 at 03:29:05PM +0200, Andrew Lunn wrote:

> Maybe. Device tree is supposed to describe the hardware, not how you
> configure the hardware. Which PTP you using is a configuration choice,
> so i expect some people will argue it should not be in DT.

+1

Pure DT means no configuration choices.

(but you find many examples that break the rules!)

Thanks,
Richard
Grygorii Strashko April 6, 2022, 11:18 a.m. UTC | #27
On 05/04/2022 18:48, Richard Cochran wrote:
> On Tue, Apr 05, 2022 at 03:29:05PM +0200, Andrew Lunn wrote:
> 
>> Maybe. Device tree is supposed to describe the hardware, not how you
>> configure the hardware. Which PTP you using is a configuration choice,
>> so i expect some people will argue it should not be in DT.
> 
> +1
> 
> Pure DT means no configuration choices.
> 
> (but you find many examples that break the rules!)
> 

My point was related to one of issues described by Michael Walle in this thread:
- supporting TS by the PHY may require also additional board support;
- phy_has_hwtstamp() defined statically by PHY drivers without taking into account board design;
- Kconfig option Doesn't really work with generic distro support and not allowed per-port cfg.

So adding smth like "hwtstamp-en" will clear identify that this particular PHY on this particular board
supports time stamping.
(or hwtstamp-full/hwtstamp-rx/hwtstamp-tx).

Of course, it will not help with default or dynamic selection of time stamping layer :(,
but it will be one problem less.
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-class-net-timestamping b/Documentation/ABI/testing/sysfs-class-net-timestamping
index 529c3a6eb607..6dfd59740cad 100644
--- a/Documentation/ABI/testing/sysfs-class-net-timestamping
+++ b/Documentation/ABI/testing/sysfs-class-net-timestamping
@@ -11,7 +11,10 @@  What:		/sys/class/net/<iface>/current_timestamping_provider
 Date:		January 2022
 Contact:	Richard Cochran <richardcochran@gmail.com>
 Description:
-		Show the current SO_TIMESTAMPING provider.
+		Shows or sets the current SO_TIMESTAMPING provider.
+		When changing the value, some packets in the kernel
+		networking stack may still be delivered with time
+		stamps from the previous provider.
 		The possible values are:
 		- "mac"  The MAC provides time stamping.
 		- "phy"  The PHY or MII device provides time stamping.
diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 1b807d119da5..269068ce3a51 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -260,6 +260,43 @@  static int dev_eth_ioctl(struct net_device *dev,
 	return err;
 }
 
+static int dev_hwtstamp_ioctl(struct net_device *dev,
+			      struct ifreq *ifr, unsigned int cmd)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	int err;
+
+	err = dsa_ndo_eth_ioctl(dev, ifr, cmd);
+	if (err == 0 || err != -EOPNOTSUPP)
+		return err;
+
+	if (!netif_device_present(dev))
+		return -ENODEV;
+
+	switch (dev->selected_timestamping_layer) {
+
+	case MAC_TIMESTAMPING:
+		if (ops->ndo_do_ioctl == phy_do_ioctl) {
+			/* Some drivers set .ndo_do_ioctl to phy_do_ioctl. */
+			err = -EOPNOTSUPP;
+		} else {
+			err = ops->ndo_eth_ioctl(dev, ifr, cmd);
+		}
+		break;
+
+	case PHY_TIMESTAMPING:
+		if (phy_has_hwtstamp(dev->phydev)) {
+			err = phy_mii_ioctl(dev->phydev, ifr, cmd);
+		} else {
+			err = -ENODEV;
+			WARN_ON(1);
+		}
+		break;
+	}
+
+	return err;
+}
+
 static int dev_siocbond(struct net_device *dev,
 			struct ifreq *ifr, unsigned int cmd)
 {
@@ -395,6 +432,9 @@  static int dev_ifsioc(struct net *net, struct ifreq *ifr, void __user *data,
 			return err;
 		fallthrough;
 
+	case SIOCGHWTSTAMP:
+		return dev_hwtstamp_ioctl(dev, ifr, cmd);
+
 	/*
 	 *	Unknown or private ioctl
 	 */
@@ -405,9 +445,7 @@  static int dev_ifsioc(struct net *net, struct ifreq *ifr, void __user *data,
 
 		if (cmd == SIOCGMIIPHY ||
 		    cmd == SIOCGMIIREG ||
-		    cmd == SIOCSMIIREG ||
-		    cmd == SIOCSHWTSTAMP ||
-		    cmd == SIOCGHWTSTAMP) {
+		    cmd == SIOCSMIIREG) {
 			err = dev_eth_ioctl(dev, ifr, cmd);
 		} else if (cmd == SIOCBONDENSLAVE ||
 		    cmd == SIOCBONDRELEASE ||
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 4ff7ef417c38..c27f01a1a285 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -664,17 +664,59 @@  static ssize_t current_timestamping_provider_show(struct device *dev,
 	if (!rtnl_trylock())
 		return restart_syscall();
 
-	if (phy_has_tsinfo(phydev)) {
-		ret = sprintf(buf, "%s\n", "phy");
-	} else {
+	switch (netdev->selected_timestamping_layer) {
+	case MAC_TIMESTAMPING:
 		ret = sprintf(buf, "%s\n", "mac");
+		break;
+	case PHY_TIMESTAMPING:
+		ret = sprintf(buf, "%s\n", "phy");
+		break;
 	}
 
 	rtnl_unlock();
 
 	return ret;
 }
-static DEVICE_ATTR_RO(current_timestamping_provider);
+
+static ssize_t current_timestamping_provider_store(struct device *dev,
+						   struct device_attribute *attr,
+						   const char *buf, size_t len)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct net *net = dev_net(netdev);
+	enum timestamping_layer flavor;
+
+	if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (!strcmp(buf, "mac\n"))
+		flavor = MAC_TIMESTAMPING;
+	else if (!strcmp(buf, "phy\n"))
+		flavor = PHY_TIMESTAMPING;
+	else
+		return -EINVAL;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	if (!dev_isalive(netdev))
+		goto out;
+
+	if (netdev->selected_timestamping_layer != flavor) {
+		const struct net_device_ops *ops = netdev->netdev_ops;
+		struct ifreq ifr = {0};
+
+		/* Disable time stamping in the current layer. */
+		if (netif_device_present(netdev) && ops->ndo_eth_ioctl)
+			ops->ndo_eth_ioctl(netdev, &ifr, SIOCSHWTSTAMP);
+
+		netdev->selected_timestamping_layer = flavor;
+	}
+out:
+	rtnl_unlock();
+	return len;
+}
+static DEVICE_ATTR_RW(current_timestamping_provider);
 
 static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_netdev_group.attr,
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 04840697fe79..31c3142787b7 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -28,6 +28,9 @@  void skb_clone_tx_timestamp(struct sk_buff *skb)
 	if (!skb->sk)
 		return;
 
+	if (skb->dev->selected_timestamping_layer != PHY_TIMESTAMPING)
+		return;
+
 	type = classify(skb);
 	if (type == PTP_CLASS_NONE)
 		return;
@@ -50,6 +53,9 @@  bool skb_defer_rx_timestamp(struct sk_buff *skb)
 	if (!skb->dev || !skb->dev->phydev || !skb->dev->phydev->mii_ts)
 		return false;
 
+	if (skb->dev->selected_timestamping_layer != PHY_TIMESTAMPING)
+		return false;
+
 	if (skb_headroom(skb) < ETH_HLEN)
 		return false;
 
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 651d18eef589..7b50820c1d1d 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -545,10 +545,20 @@  int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
 	memset(info, 0, sizeof(*info));
 	info->cmd = ETHTOOL_GET_TS_INFO;
 
-	if (phy_has_tsinfo(phydev))
-		return phy_ts_info(phydev, info);
-	if (ops->get_ts_info)
-		return ops->get_ts_info(dev, info);
+	switch (dev->selected_timestamping_layer) {
+
+	case MAC_TIMESTAMPING:
+		if (ops->get_ts_info)
+			return ops->get_ts_info(dev, info);
+		break;
+
+	case PHY_TIMESTAMPING:
+		if (phy_has_tsinfo(phydev)) {
+			return phy_ts_info(phydev, info);
+		}
+		WARN_ON(1);
+		return -ENODEV;
+	}
 
 	info->so_timestamping = SOF_TIMESTAMPING_RX_SOFTWARE |
 				SOF_TIMESTAMPING_SOFTWARE;