diff mbox series

[net-next] net: ethtool: allow MAC drivers to override ethtool get_ts_info

Message ID E1kyYfI-0004wl-Tf@rmk-PC.armlinux.org.uk (mailing list archive)
State Deferred
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: ethtool: allow MAC drivers to override ethtool get_ts_info | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 4 maintainers not CCed: mkubecek@suse.cz f.fainelli@gmail.com alobakin@pm.me meirl@mellanox.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 20 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Russell King (Oracle) Jan. 10, 2021, 11:13 a.m. UTC
Check whether the MAC driver has implemented the get_ts_info()
method first, and call it if present.  If this method returns
-EOPNOTSUPP, defer to the phylib or default implementation.

This allows network drivers such as mvpp2 to use their more accurate
timestamping implementation than using a less accurate implementation
in the PHY. Network drivers can opt to defer to phylib by returning
-EOPNOTSUPP.

This change will be needed if the Marvell PHY drivers add support for
PTP.

Note: this may cause a change for any drivers that use phylib and
provide get_ts_info(). It is not obvious if any such cases exist.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 net/ethtool/common.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Andrew Lunn Jan. 10, 2021, 4:35 p.m. UTC | #1
On Sun, Jan 10, 2021 at 11:13:44AM +0000, Russell King wrote:
> Check whether the MAC driver has implemented the get_ts_info()
> method first, and call it if present.  If this method returns
> -EOPNOTSUPP, defer to the phylib or default implementation.
> 
> This allows network drivers such as mvpp2 to use their more accurate
> timestamping implementation than using a less accurate implementation
> in the PHY. Network drivers can opt to defer to phylib by returning
> -EOPNOTSUPP.
> 
> This change will be needed if the Marvell PHY drivers add support for
> PTP.
> 
> Note: this may cause a change for any drivers that use phylib and
> provide get_ts_info(). It is not obvious if any such cases exist.

Hi Russell

We can detect that condition through? Call both, then do a WARN() if
we are changing the order? Maybe we should do that for a couple of
cycles?

For netlink ethtool, we can also provide an additional attribute. A
MAC, or PHY indicator we can do in the core. A string for the name of
the driver would need a bigger change.

	Andrew
Jakub Kicinski Jan. 14, 2021, 3:05 a.m. UTC | #2
On Sun, 10 Jan 2021 17:35:25 +0100 Andrew Lunn wrote:
> On Sun, Jan 10, 2021 at 11:13:44AM +0000, Russell King wrote:
> > Check whether the MAC driver has implemented the get_ts_info()
> > method first, and call it if present.  If this method returns
> > -EOPNOTSUPP, defer to the phylib or default implementation.
> > 
> > This allows network drivers such as mvpp2 to use their more accurate
> > timestamping implementation than using a less accurate implementation
> > in the PHY. Network drivers can opt to defer to phylib by returning
> > -EOPNOTSUPP.
> > 
> > This change will be needed if the Marvell PHY drivers add support for
> > PTP.
> > 
> > Note: this may cause a change for any drivers that use phylib and
> > provide get_ts_info(). It is not obvious if any such cases exist.  
> 
> Hi Russell
> 
> We can detect that condition through? Call both, then do a WARN() if
> we are changing the order? Maybe we should do that for a couple of
> cycles?
> 
> For netlink ethtool, we can also provide an additional attribute. A
> MAC, or PHY indicator we can do in the core. A string for the name of
> the driver would need a bigger change.

I'm not seeing a response to this suggestion, did vger eat it?
Richard Cochran Jan. 14, 2021, 12:55 p.m. UTC | #3
On Sun, Jan 10, 2021 at 11:13:44AM +0000, Russell King wrote:
> This allows network drivers such as mvpp2 to use their more accurate
> timestamping implementation than using a less accurate implementation
> in the PHY. Network drivers can opt to defer to phylib by returning
> -EOPNOTSUPP.

My expectation is that PHY time stamping is more accurate than MAC
time stamping.
 
> This change will be needed if the Marvell PHY drivers add support for
> PTP.

Huh?  The mvpp2 appears to be a MAC.  If this device has integrated
PHYs then I don't see the issue.  If your board has the nvpp2 device
with the dp83640 PHYTER, then don't you want to actually use the
PHYTER?

From my observation of the product offerings, I have yet to see a new
PHY (besides the dp83640 PHYTER) that implement time stamping.  The
PHYTER is 100 megabit only, and my understanding that it is too
difficult or even impossible to provide time stamps from within a
gigabit+ PHY.  So you needn't fear new time stamping PHYs to spoil
your setup!

> Note: this may cause a change for any drivers that use phylib and
> provide get_ts_info(). It is not obvious if any such cases exist.

Up until now, the code always favored PHY devices and devices external
to the MAC that snoop on the MII bus.  The assumption is that anyone
who builds a board with such specialty devices really wants to use
them.

Thanks,
Richard
Russell King (Oracle) Jan. 14, 2021, 1:22 p.m. UTC | #4
On Thu, Jan 14, 2021 at 04:55:06AM -0800, Richard Cochran wrote:
> On Sun, Jan 10, 2021 at 11:13:44AM +0000, Russell King wrote:
> > This allows network drivers such as mvpp2 to use their more accurate
> > timestamping implementation than using a less accurate implementation
> > in the PHY. Network drivers can opt to defer to phylib by returning
> > -EOPNOTSUPP.
> 
> My expectation is that PHY time stamping is more accurate than MAC
> time stamping.

PHY time stamping may be a "more accurate" location to get timestamps,
in terms of the hardware, but when you consider the entire setup, that
is not necessarily the case.

> > This change will be needed if the Marvell PHY drivers add support for
> > PTP.
> 
> Huh? The mvpp2 appears to be a MAC.  If this device has integrated
> PHYs then I don't see the issue.  If your board has the nvpp2 device
> with the dp83640 PHYTER, then don't you want to actually use the
> PHYTER?

You seem to be adding more information way beyond what I'm saying.

No, there aren't integrated PHYs. There's an external PHY - a Marvell
88e151x which has what I would call rudimentary stamping abilities,
whereas the mvpp2 has advanced stamping abilities.

You implemented the Marvell timestamping in DSA, so you know what the
Marvell offering there looks like and what it is capable of. That same
hardware appears in some Marvell PHYs.

The mvpp2 hardware (which has support already merged after you acked
the TAI part, and failed to provide comments on the mvpp2 part - so
davem gave up waiting) is capable of:
- stamping every received packet irrespective of its type
- stamping any transmitted packet or only those we wish to stamp
- inserting a timestamp into the packet (aka one-step, although that
  isn't implemented that yet)
- correcting the hardware time counter tick rate

There is considerable noise in reading the hardware timestamp counter
from the PHYs - caused by latency over the MDIO bus, which makes the
achievable accuracy lower. That noise is very much reduced when reading
the hardware timestamp counter from mvpp2 - where we can implement
mvpp22_tai_gettimex64(). Therefore, the achieved accuracy from mvpp2 is
higher than from a PHY.

We had already discussed this patch last year, and you agreed with it
then. What has changed?
Russell King (Oracle) Jan. 14, 2021, 1:32 p.m. UTC | #5
On Thu, Jan 14, 2021 at 01:22:17PM +0000, Russell King - ARM Linux admin wrote:
> On Thu, Jan 14, 2021 at 04:55:06AM -0800, Richard Cochran wrote:
> > On Sun, Jan 10, 2021 at 11:13:44AM +0000, Russell King wrote:
> > > This allows network drivers such as mvpp2 to use their more accurate
> > > timestamping implementation than using a less accurate implementation
> > > in the PHY. Network drivers can opt to defer to phylib by returning
> > > -EOPNOTSUPP.
> > 
> > My expectation is that PHY time stamping is more accurate than MAC
> > time stamping.
> 
> PHY time stamping may be a "more accurate" location to get timestamps,
> in terms of the hardware, but when you consider the entire setup, that
> is not necessarily the case.
> 
> > > This change will be needed if the Marvell PHY drivers add support for
> > > PTP.
> > 
> > Huh? The mvpp2 appears to be a MAC.  If this device has integrated
> > PHYs then I don't see the issue.  If your board has the nvpp2 device
> > with the dp83640 PHYTER, then don't you want to actually use the
> > PHYTER?
> 
> You seem to be adding more information way beyond what I'm saying.
> 
> No, there aren't integrated PHYs. There's an external PHY - a Marvell
> 88e151x which has what I would call rudimentary stamping abilities,
> whereas the mvpp2 has advanced stamping abilities.
> 
> You implemented the Marvell timestamping in DSA, so you know what the
> Marvell offering there looks like and what it is capable of. That same
> hardware appears in some Marvell PHYs.
> 
> The mvpp2 hardware (which has support already merged after you acked
> the TAI part, and failed to provide comments on the mvpp2 part - so
> davem gave up waiting) is capable of:
> - stamping every received packet irrespective of its type
> - stamping any transmitted packet or only those we wish to stamp
> - inserting a timestamp into the packet (aka one-step, although that
>   isn't implemented that yet)
> - correcting the hardware time counter tick rate
> 
> There is considerable noise in reading the hardware timestamp counter
> from the PHYs - caused by latency over the MDIO bus, which makes the
> achievable accuracy lower. That noise is very much reduced when reading
> the hardware timestamp counter from mvpp2 - where we can implement
> mvpp22_tai_gettimex64(). Therefore, the achieved accuracy from mvpp2 is
> higher than from a PHY.
> 
> We had already discussed this patch last year, and you agreed with it
> then. What has changed?

See the discussion in this sub-thread:

https://lore.kernel.org/netdev/20200729105807.GZ1551@shell.armlinux.org.uk/
Russell King (Oracle) Jan. 14, 2021, 5:09 p.m. UTC | #6
On Sun, Jan 10, 2021 at 05:35:25PM +0100, Andrew Lunn wrote:
> On Sun, Jan 10, 2021 at 11:13:44AM +0000, Russell King wrote:
> > Check whether the MAC driver has implemented the get_ts_info()
> > method first, and call it if present.  If this method returns
> > -EOPNOTSUPP, defer to the phylib or default implementation.
> > 
> > This allows network drivers such as mvpp2 to use their more accurate
> > timestamping implementation than using a less accurate implementation
> > in the PHY. Network drivers can opt to defer to phylib by returning
> > -EOPNOTSUPP.
> > 
> > This change will be needed if the Marvell PHY drivers add support for
> > PTP.
> > 
> > Note: this may cause a change for any drivers that use phylib and
> > provide get_ts_info(). It is not obvious if any such cases exist.
> 
> Hi Russell
> 
> We can detect that condition through? Call both, then do a WARN() if
> we are changing the order? Maybe we should do that for a couple of
> cycles?

I guess we could do something, but IMHO this really does not justify
using heavy hammers like WARN(). It's pointless producing a backtrace.
If we want a large noisy multi-line message that stands out, we should
just do that.

I think we can detect with something like:

	if (ops->get_ts_info && phy_has_tsinfo(phydev)) {
		netdev_warn(dev, "Both the PHY and the MAC support PTP. Which you end up with may change.\n");
	}

That said, this is _actually_ a fix.

As the code stands today:

__ethtool_get_ts_info() checks phy_has_tsinfo() and uses phy_ts_info()
in preference to ops->get_ts_info(), giving the PHY code first dibs on
intercepting this call.

Meanwhile, the ioctl() code gives the network driver first dibs on
intercepting the SIOCSHWTSTAMP and SIOCGHWTSTAMP ioctls.

This means that if you have both a PHY supporting timestamping, and a
MAC driver, you end up with the ethtool get_ts_info() call giving a
response from the PHY implementation but SIOCSHWTSTAMP and
SIOCGHWTSTAMP being intercepted by the MAC implementation.

This is exactly what will happen today to mvpp2 if we merge my patches
adding PTP support to the Marvell 88e151x PHYs without this patch.

So, my patch merely brings consistency to this.

> For netlink ethtool, we can also provide an additional attribute. A
> MAC, or PHY indicator we can do in the core. A string for the name of
> the driver would need a bigger change.

Unfortunately, PTP is not solely controlled through ethtool - it's
also controlled via ioctl() where it's not so easy to direct the
calls to either the MAC or PHY.
Richard Cochran Jan. 14, 2021, 5:27 p.m. UTC | #7
On Thu, Jan 14, 2021 at 01:32:35PM +0000, Russell King - ARM Linux admin wrote:
> > We had already discussed this patch last year, and you agreed with it
> > then. What has changed?
> 
> See the discussion in this sub-thread:
> 
> https://lore.kernel.org/netdev/20200729105807.GZ1551@shell.armlinux.org.uk/

Thanks for the reminder.  We ended up with having to review the MAC
drivers that support phydev.

   https://lore.kernel.org/netdev/20200730194427.GE1551@shell.armlinux.org.uk/

There is at least the FEC that supports phydev.  I have a board that
combines the FEC with the dp83640 PHYTER, and your patch would break
this setup.  (In the case of this HW combination, the PHYTER is
superior in every way.)

Another combination that I have seen twice is the TI am335x with its
cpsw MAC and the PHYTER.  Unfortunately I don't have one of these
boards, but people made them because the cpsw MAC supports time
stamping in a way that is inadequate.

I *think* the cpsw/phyter combination would work with your patch, but
only if the users disable CONFIG_TI_CPTS at compile time.

Thanks,
Richard
Russell King (Oracle) Jan. 14, 2021, 5:27 p.m. UTC | #8
On Thu, Jan 14, 2021 at 05:09:42PM +0000, Russell King - ARM Linux admin wrote:
> On Sun, Jan 10, 2021 at 05:35:25PM +0100, Andrew Lunn wrote:
> > On Sun, Jan 10, 2021 at 11:13:44AM +0000, Russell King wrote:
> > > Check whether the MAC driver has implemented the get_ts_info()
> > > method first, and call it if present.  If this method returns
> > > -EOPNOTSUPP, defer to the phylib or default implementation.
> > > 
> > > This allows network drivers such as mvpp2 to use their more accurate
> > > timestamping implementation than using a less accurate implementation
> > > in the PHY. Network drivers can opt to defer to phylib by returning
> > > -EOPNOTSUPP.
> > > 
> > > This change will be needed if the Marvell PHY drivers add support for
> > > PTP.
> > > 
> > > Note: this may cause a change for any drivers that use phylib and
> > > provide get_ts_info(). It is not obvious if any such cases exist.
> > 
> > Hi Russell
> > 
> > We can detect that condition through? Call both, then do a WARN() if
> > we are changing the order? Maybe we should do that for a couple of
> > cycles?
> 
> I guess we could do something, but IMHO this really does not justify
> using heavy hammers like WARN(). It's pointless producing a backtrace.
> If we want a large noisy multi-line message that stands out, we should
> just do that.
> 
> I think we can detect with something like:
> 
> 	if (ops->get_ts_info && phy_has_tsinfo(phydev)) {
> 		netdev_warn(dev, "Both the PHY and the MAC support PTP. Which you end up with may change.\n");
> 	}
> 
> That said, this is _actually_ a fix.
> 
> As the code stands today:
> 
> __ethtool_get_ts_info() checks phy_has_tsinfo() and uses phy_ts_info()
> in preference to ops->get_ts_info(), giving the PHY code first dibs on
> intercepting this call.
> 
> Meanwhile, the ioctl() code gives the network driver first dibs on
> intercepting the SIOCSHWTSTAMP and SIOCGHWTSTAMP ioctls.
> 
> This means that if you have both a PHY supporting timestamping, and a
> MAC driver, you end up with the ethtool get_ts_info() call giving a
> response from the PHY implementation but SIOCSHWTSTAMP and
> SIOCGHWTSTAMP being intercepted by the MAC implementation.
> 
> This is exactly what will happen today to mvpp2 if we merge my patches
> adding PTP support to the Marvell 88e151x PHYs without this patch.
> 
> So, my patch merely brings consistency to this.
> 
> > For netlink ethtool, we can also provide an additional attribute. A
> > MAC, or PHY indicator we can do in the core. A string for the name of
> > the driver would need a bigger change.
> 
> Unfortunately, PTP is not solely controlled through ethtool - it's
> also controlled via ioctl() where it's not so easy to direct the
> calls to either the MAC or PHY.

BTW, none of this is new informationm we're just re-covering the same
points that were already discussed back in July 2020:

https://lore.kernel.org/netdev/20200729105807.GZ1551@shell.armlinux.org.uk/
Russell King (Oracle) Jan. 14, 2021, 5:31 p.m. UTC | #9
On Thu, Jan 14, 2021 at 09:27:12AM -0800, Richard Cochran wrote:
> On Thu, Jan 14, 2021 at 01:32:35PM +0000, Russell King - ARM Linux admin wrote:
> > > We had already discussed this patch last year, and you agreed with it
> > > then. What has changed?
> > 
> > See the discussion in this sub-thread:
> > 
> > https://lore.kernel.org/netdev/20200729105807.GZ1551@shell.armlinux.org.uk/
> 
> Thanks for the reminder.  We ended up with having to review the MAC
> drivers that support phydev.
> 
>    https://lore.kernel.org/netdev/20200730194427.GE1551@shell.armlinux.org.uk/
> 
> There is at least the FEC that supports phydev.  I have a board that
> combines the FEC with the dp83640 PHYTER, and your patch would break
> this setup.  (In the case of this HW combination, the PHYTER is
> superior in every way.)
> 
> Another combination that I have seen twice is the TI am335x with its
> cpsw MAC and the PHYTER.  Unfortunately I don't have one of these
> boards, but people made them because the cpsw MAC supports time
> stamping in a way that is inadequate.
> 
> I *think* the cpsw/phyter combination would work with your patch, but
> only if the users disable CONFIG_TI_CPTS at compile time.

I think then the only solution is to move the decision how to handle
get_ts_info into each MAC driver and get rid of:

	if (phy_has_tsinfo(phydev))
	        return phy_ts_info(phydev, info);

in __ethtool_get_ts_info().
Russell King (Oracle) Jan. 14, 2021, 10:38 p.m. UTC | #10
On Thu, Jan 14, 2021 at 05:31:11PM +0000, Russell King - ARM Linux admin wrote:
> On Thu, Jan 14, 2021 at 09:27:12AM -0800, Richard Cochran wrote:
> > Thanks for the reminder.  We ended up with having to review the MAC
> > drivers that support phydev.
> > 
> >    https://lore.kernel.org/netdev/20200730194427.GE1551@shell.armlinux.org.uk/
> > 
> > There is at least the FEC that supports phydev.  I have a board that
> > combines the FEC with the dp83640 PHYTER, and your patch would break
> > this setup.  (In the case of this HW combination, the PHYTER is
> > superior in every way.)
> > 
> > Another combination that I have seen twice is the TI am335x with its
> > cpsw MAC and the PHYTER.  Unfortunately I don't have one of these
> > boards, but people made them because the cpsw MAC supports time
> > stamping in a way that is inadequate.
> > 
> > I *think* the cpsw/phyter combination would work with your patch, but
> > only if the users disable CONFIG_TI_CPTS at compile time.
> 
> I think then the only solution is to move the decision how to handle
> get_ts_info into each MAC driver and get rid of:
> 
> 	if (phy_has_tsinfo(phydev))
> 	        return phy_ts_info(phydev, info);
> 
> in __ethtool_get_ts_info().

Thinking about this more, that is an impossible task - there's no
obvious information around to suggest which ethernet drivers could
possibly be attached to a phylib PHY that supports PTP.

So, I think the only way to prevent a regression with the code as
it is today is that we _never_ support PTP on Marvell PHYs - because
doing so _will_ break the existing MVPP2 driver's implementation and
cause a regression.

Right now, there is no option: if a PHY supports PTP, then the only
option is to use the PHYs PTP. Which is utterly rediculous.

Unless you can see a way around it. Because I can't.
Richard Cochran Jan. 21, 2021, 4:04 a.m. UTC | #11
On Thu, Jan 14, 2021 at 10:38:00PM +0000, Russell King - ARM Linux admin wrote:

> So, I think the only way to prevent a regression with the code as
> it is today is that we _never_ support PTP on Marvell PHYs - because
> doing so _will_ break the existing MVPP2 driver's implementation and
> cause a regression.

The situation isn't as bad as it seems.

For one thing, mvpp2 incorrectly selects NETWORK_PHY_TIMESTAMPING.
It really shouldn't.  I'm submitting a fix soon.

As long as the new PHY driver (or at least the PTP bit) depends on
NETWORK_PHY_TIMESTAMPING, then that allows users who _really_ want
that to enable the option at compile time.  This option adds extra
checks into the networking hot path, and almost everyone should avoid
enabling it.

> Right now, there is no option: if a PHY supports PTP, then the only
> option is to use the PHYs PTP. Which is utterly rediculous.
> 
> Unless you can see a way around it. Because I can't.

I still think the original and best method is to hide the two (and
with your new driver, three) esoteric PHY time stamping drivers behind
a Kconfig option, and structure the code to favor PHY devices.

The idea to favor the MACs, from back in July, doesn't really change
the fundamental limitations that

- MAC and PHY time stamping cannot work simultaneously, and

- Users of PHY devices (representing a tiny minority) must enable the
  otherwise uninteresting NETWORK_PHY_TIMESTAMPING option at compile
  time.

Thanks,
Richard
Russell King (Oracle) Jan. 21, 2021, 10:27 a.m. UTC | #12
On Wed, Jan 20, 2021 at 08:04:51PM -0800, Richard Cochran wrote:
> On Thu, Jan 14, 2021 at 10:38:00PM +0000, Russell King - ARM Linux admin wrote:
> 
> > So, I think the only way to prevent a regression with the code as
> > it is today is that we _never_ support PTP on Marvell PHYs - because
> > doing so _will_ break the existing MVPP2 driver's implementation and
> > cause a regression.
> 
> The situation isn't as bad as it seems.
> 
> For one thing, mvpp2 incorrectly selects NETWORK_PHY_TIMESTAMPING.
> It really shouldn't.  I'm submitting a fix soon.
> 
> As long as the new PHY driver (or at least the PTP bit) depends on
> NETWORK_PHY_TIMESTAMPING, then that allows users who _really_ want
> that to enable the option at compile time.  This option adds extra
> checks into the networking hot path, and almost everyone should avoid
> enabling it.

As I already explained to you, you can *NOT* use kernel configuration
to make the choice.  ARM is a multi-platform kernel, and we will not
stand for platform choices dictated by kernel configuration options.
Richard Cochran Jan. 21, 2021, 3:06 p.m. UTC | #13
On Thu, Jan 21, 2021 at 10:27:38AM +0000, Russell King - ARM Linux admin wrote:
> As I already explained to you, you can *NOT* use kernel configuration
> to make the choice.  ARM is a multi-platform kernel, and we will not
> stand for platform choices dictated by kernel configuration options.

This has nothing to do with ARM as a platform.  The same limitation
applies to x86 and all the rest.

The networking stack does not support simultaneous PHY and MAC time
stamping.

If you enable both MAC and PHY time stamping, then only one will be
reported, and that one is the PHY.

Thanks,
Richard
Andrew Lunn Jan. 21, 2021, 4:22 p.m. UTC | #14
On Thu, Jan 21, 2021 at 07:06:11AM -0800, Richard Cochran wrote:
> On Thu, Jan 21, 2021 at 10:27:38AM +0000, Russell King - ARM Linux admin wrote:
> > As I already explained to you, you can *NOT* use kernel configuration
> > to make the choice.  ARM is a multi-platform kernel, and we will not
> > stand for platform choices dictated by kernel configuration options.
> 
> This has nothing to do with ARM as a platform.  The same limitation
> applies to x86 and all the rest.
> 
> The networking stack does not support simultaneous PHY and MAC time
> stamping.

Hi Richard

And this appears to be the crux of the problem.

You say PHY timestamping is exotic. I would disagree with that.  Many
Marvell PHYs support it, using similar hardware as to what is in the
Marvell DSA switches. The Aquantia PHYs support it, but have no driver
support at the moment. Microsemi PHYs have driver support for
it. There are Broadcom PHYs with the needed hardware. So does some of
the Qualcom PHYs. There are automotive T1 PHYs with PTP hardware
support. Some Realtek devices have the hardware.

There is a growing interesting in PTP, the number of drivers keeps
going up. The likelihood of MAC/PHY combination having two
timestamping sources is growing all the time. So the stack needs to
change to support the selection of the timestamp source.

       Andrew
Richard Cochran Jan. 21, 2021, 5:03 p.m. UTC | #15
On Thu, Jan 21, 2021 at 05:22:37PM +0100, Andrew Lunn wrote:

> There is a growing interesting in PTP, the number of drivers keeps
> going up. The likelihood of MAC/PHY combination having two
> timestamping sources is growing all the time. So the stack needs to
> change to support the selection of the timestamp source.

Fine, but How should the support look like?

- New/extended time stamping API that delivers multiple time stamps?

- sysctl to select MAC/PHY preference at run time globally?

- per-interface ethtool control?

- per-socket control?  (probably not feasible, but heh)

Back of the napkin design ideas appreciated!

Thanks,
Richard
Russell King (Oracle) Jan. 21, 2021, 6:18 p.m. UTC | #16
On Thu, Jan 21, 2021 at 07:06:11AM -0800, Richard Cochran wrote:
> On Thu, Jan 21, 2021 at 10:27:38AM +0000, Russell King - ARM Linux admin wrote:
> > As I already explained to you, you can *NOT* use kernel configuration
> > to make the choice.  ARM is a multi-platform kernel, and we will not
> > stand for platform choices dictated by kernel configuration options.
> 
> This has nothing to do with ARM as a platform.  The same limitation
> applies to x86 and all the rest.

You don't plainly don't understand, because you just called ARM a
platform. ARM isn't a platform, it's an architecture, which is a
totally different thing.

A platform is a device, like a phone, or a router, or a computer.
However, all platforms may have the same CPU architecture - that
being ARM.

ARM as an architecture is in the situation where distributions have
pushed hard for it to be like x86. You build one kernel for your
distribution, not multiple hardware specific kernels. That means,
there is one kernel configuration. Platform specifics are handled
through firmware such as device tree.

Having platform specific kernel configuration options, such as "we
want to use PHY timestamping" vs "we want to use MAC timestamping"
do not work in this environment.
Andrew Lunn Jan. 21, 2021, 6:55 p.m. UTC | #17
On Thu, Jan 21, 2021 at 09:03:47AM -0800, Richard Cochran wrote:
> On Thu, Jan 21, 2021 at 05:22:37PM +0100, Andrew Lunn wrote:
> 
> > There is a growing interesting in PTP, the number of drivers keeps
> > going up. The likelihood of MAC/PHY combination having two
> > timestamping sources is growing all the time. So the stack needs to
> > change to support the selection of the timestamp source.
> 
> Fine, but How should the support look like?
> 
> - New/extended time stamping API that delivers multiple time stamps?
> 
> - sysctl to select MAC/PHY preference at run time globally?
> 
> - per-interface ethtool control?
> 
> - per-socket control?  (probably not feasible, but heh)
> 
> Back of the napkin design ideas appreciated!

Do you know of any realistic uses cases for using two time stampers
for the same netdev? If we can eliminate that, then it is down to
selecting which one to use. And i would say, the selection needs to be
per netdev.

I don't know the ptp subsystem very well, but it seems like
ptp_clock_register() does not know which netdev the device is being
registered against. Is it guaranteed that parent is actual a MAC?
Seems like adding a netdev member to that call could be good, so the
core knows what stampers are available per netdev. It then becomes
possible to enumerate the stampers associated to a netdev, and to
select one to be used.

get_ts_info() is a problem because the MAC directly implements it. It
seems like you need all kAPI calls to go into the ptp core. It can
then call into the selected driver. So ptp_clock_info might need
additional methods, get_ts_info() for example.

	   Andrew
Russell King (Oracle) Jan. 21, 2021, 10:59 p.m. UTC | #18
On Thu, Jan 21, 2021 at 07:55:34PM +0100, Andrew Lunn wrote:
> On Thu, Jan 21, 2021 at 09:03:47AM -0800, Richard Cochran wrote:
> > On Thu, Jan 21, 2021 at 05:22:37PM +0100, Andrew Lunn wrote:
> > 
> > > There is a growing interesting in PTP, the number of drivers keeps
> > > going up. The likelihood of MAC/PHY combination having two
> > > timestamping sources is growing all the time. So the stack needs to
> > > change to support the selection of the timestamp source.
> > 
> > Fine, but How should the support look like?
> > 
> > - New/extended time stamping API that delivers multiple time stamps?
> > 
> > - sysctl to select MAC/PHY preference at run time globally?
> > 
> > - per-interface ethtool control?
> > 
> > - per-socket control?  (probably not feasible, but heh)
> > 
> > Back of the napkin design ideas appreciated!
> 
> Do you know of any realistic uses cases for using two time stampers
> for the same netdev? If we can eliminate that, then it is down to
> selecting which one to use. And i would say, the selection needs to be
> per netdev.

Yes, I think it needs to be per-netdev.

As I've already demonstrated in previous emails last year on this
subject, we have the situation where, on the Macchiatobin board,
the ability to sync the PTP clock is quite different between the
PHY and the MAC:

- the PHY (slow to access, only event capture which may not be wired,
  doesn't seem to synchronise well - delay of 58000, frequency changes
  every second between +/-1500ppb.)

- the Ethernet MAC (fast to access, supports event capture and trigger
  generation which also may not be wired, synchronises well, delay of
  700, frequency changes every second +/- 40ppb.)

These are not theoretical numbers, they are observed measurements on
real hardware. The MAC is 37 times more stable than the PHY and 82
times faster to access, which makes the MAC a higher quality clock,
and more desirable to use.

Much of the instabillity comes from the need to access the PHY over
MDIO - which introduces unpredictable latency, such as due to the bus
access locking with other PHYLIB activities delaying PTP accesses.

While it is true that a PHY _may_ be closer to the wire in terms of its
ability to timestamp the packet, if the quality of its clock that it is
stamping with is significantly less than that which is achievable from
the MAC, then "always use a PHY if present because it's more accurate"
is a _very_ false assumption to make.

Things could well be different /if/ the PHY that was being used had
usable hardware synchronisation (including hardware event capture, and
hardware adjustment of the tick rate.) At that point, the PHY would be
the device to use.

So, if we have the case where we have timestamping available at both a
PHY and a MAC, the choice of which should be used, even for one SoC
containing a mvpp2 network device, is not clear-cut "always use mvpp2"
or "always use PHY".

Now, what I complain about is what I see as a messed up interface in
the PTP world:

- get_ts_info is an ethtool method, where _if_ the PHY supports
  timestamping, get_ts_info() will _always_ be directed to the PHY,
  and the MAC driver gets no choice in the matter.

- the IOCTLs to control timestamping go via the .ndo_ioctl interface,
  where, typically, the MAC driver intercepts the calls that it is
  interested in, deferring any calls it is not to phy_ioctl() or
  similar. It is inside phy_ioctl() that we find the rest of the PTP
  interface.

So, right now, if we end up where, say, we have the mvpp2 driver with
a PHY supporting timestamping, we get get_ts_info() reporting what the
PHY is capable of, but the IOCTLs that actually control PTP get
intercepted and acted upon by the mvpp2 driver.

The solution used by fec_main.c is to detect whether the PHY supports
timestamping using phy_has_hwtstamp(phydev), and defer the IOCTLs to
phy_ioctl(). That is fine if you want the PHY PTP implementation to
always override the MAC PTP implementation.

If you want the other way around, then currently it is impossible
because the MAC driver has no choice in whether it sees the
get_ts_info() ethtool call.

So, if I were to merge my Marvell PHY PTP code, let's say for argument
sake that NETWORK_PHY_TIMESTAMPING dependency was dropped from mvpp2,
and a distro comes along and provides its users with a single kernel
(as is the case today) which is built with NETWORK_PHY_TIMESTAMPING
turned on (because some of the platforms require it) then there is
_no_ choice to use the more stable MAC PTP implementation. People would
have to suffer an inferior PHY based implementation despite the
hardware being capable of better.

We should be giving people the choice of which they want to use.

So, can I suggest that we make a step to _first_ santise how PTP is
dealt with in the network layer. Can I suggest that we have a
struct net_ptp_ops (or whatever name you care) that is attached to a
a net_device structure that contains _all_ the operations to do with
a PTP implementation - get_ts_info() and the appropriate PTP ioctls.
At that point, we have a much saner definition of how the networking
layer talks to the PTP layer, and the decision about whether to use a
PHY or a MAC becomes one of which set of operations the pointer to
struct net_ptp_ops are pointing at - and we kill the struct ethtool_ops
get_ts_info member, and the parsing of the ioctl numbers in network
and PHY drivers.

At that point, we have some options. We can think about having a list
of PTP clocks attached to a network device, and an API to userspace to
select between them. Alternatively, we could do what is done with
clocksources, where we provide a rating that determines their "quality"
for use, so the "best" can be selected automatically by the kernel.
Or whatever other scheme takes our fancy.

The problem right now is we're tied in to the current approach that the
PHY always takes precedence, and we must have code in every network
driver that has PTP support to defer to the PHY implementation in its
ioctl handler.

Does this sound like a sane first step?

(Please note, I've spent since yesterday evening tracking down a horrid
KVM regression with 32-bit ARM kernels post 5.6, and we're not yet done
with it, so I may not have lots of time to work on this until that is
properly sorted - hence my short replies earlier today as I didn't have
time for anything else until now. Sorry about that.)
diff mbox series

Patch

diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 24036e3055a1..9ec93e24f239 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -385,14 +385,18 @@  int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
 {
 	const struct ethtool_ops *ops = dev->ethtool_ops;
 	struct phy_device *phydev = dev->phydev;
+	int ret;
 
 	memset(info, 0, sizeof(*info));
 	info->cmd = ETHTOOL_GET_TS_INFO;
 
+	if (ops->get_ts_info) {
+		ret = ops->get_ts_info(dev, info);
+		if (ret != -EOPNOTSUPP)
+			return ret;
+	}
 	if (phy_has_tsinfo(phydev))
 		return phy_ts_info(phydev, info);
-	if (ops->get_ts_info)
-		return ops->get_ts_info(dev, info);
 
 	info->so_timestamping = SOF_TIMESTAMPING_RX_SOFTWARE |
 				SOF_TIMESTAMPING_SOFTWARE;