diff mbox series

[net-next,2/3] dt-bindings: net: micrel: Configure latency values and timestamping check for LAN8814 phy

Message ID 20220304093418.31645-3-Divya.Koppera@microchip.com (mailing list archive)
State Accepted
Commit 2358dd3fd325fc9689652f0c8d4a09475dd31b46
Delegated to: Netdev Maintainers
Headers show
Series Add support for 1588 in LAN8814 | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 20 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Divya Koppera March 4, 2022, 9:34 a.m. UTC
Supports configuring latency values and also adds
check for phy timestamping feature.

Signed-off-by: Divya Koppera<Divya.Koppera@microchip.com>
---
 .../devicetree/bindings/net/micrel.txt          | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Andrew Lunn March 4, 2022, 12:50 p.m. UTC | #1
On Fri, Mar 04, 2022 at 03:04:17PM +0530, Divya Koppera wrote:
> Supports configuring latency values and also adds
> check for phy timestamping feature.
> 
> Signed-off-by: Divya Koppera<Divya.Koppera@microchip.com>
> ---
>  .../devicetree/bindings/net/micrel.txt          | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/micrel.txt b/Documentation/devicetree/bindings/net/micrel.txt
> index 8d157f0295a5..c5ab62c39133 100644
> --- a/Documentation/devicetree/bindings/net/micrel.txt
> +++ b/Documentation/devicetree/bindings/net/micrel.txt
> @@ -45,3 +45,20 @@ Optional properties:
>  
>  	In fiber mode, auto-negotiation is disabled and the PHY can only work in
>  	100base-fx (full and half duplex) modes.
> +
> + - lan8814,ignore-ts: If present the PHY will not support timestamping.
> +
> +	This option acts as check whether Timestamping is supported by
> +	hardware or not. LAN8814 phy support hardware tmestamping.

Does this mean the hardware itself cannot tell you it is missing the
needed hardware? What happens when you forget to add this flag? Does
the driver timeout waiting for hardware which does not exists?

> + - lan8814,latency_rx_10: Configures Latency value of phy in ingress at 10 Mbps.
> +
> + - lan8814,latency_tx_10: Configures Latency value of phy in egress at 10 Mbps.
> +
> + - lan8814,latency_rx_100: Configures Latency value of phy in ingress at 100 Mbps.
> +
> + - lan8814,latency_tx_100: Configures Latency value of phy in egress at 100 Mbps.
> +
> + - lan8814,latency_rx_1000: Configures Latency value of phy in ingress at 1000 Mbps.
> +
> + - lan8814,latency_tx_1000: Configures Latency value of phy in egress at 1000 Mbps.

Why does this need to be configured, rather than hard coded? Why would
the latency for a given speed change? I would of thought though you
would take the average length of a PTP packet and divide is by the
link speed.

     Andrew
Richard Cochran March 4, 2022, 1:55 p.m. UTC | #2
On Fri, Mar 04, 2022 at 01:50:47PM +0100, Andrew Lunn wrote:
> Why does this need to be configured, rather than hard coded? Why would
> the latency for a given speed change? I would of thought though you
> would take the average length of a PTP packet and divide is by the
> link speed.

Latency is unrelated to frame length.

My understanding is that it is VERY tricky to measure PHY latency.
Studies have shown that some PHYs vary by link speed, and some vary
randomly, frame by frame.

So I can understand wanting to configure it.  However, DTS is probably
the wrong place.  The linuxptp user space stack has configuration
variables for this purpose:

       egressLatency
              Specifies  the  difference  in  nanoseconds  between  the actual
              transmission time at the reference plane and the reported trans‐
              mit  time  stamp. This value will be added to egress time stamps
              obtained from the hardware.  The default is 0.

       ingressLatency
              Specifies the difference in nanoseconds between the reported re‐
              ceive  time  stamp  and  the  actual reception time at reference
              plane. This value will be subtracted from  ingress  time  stamps
              obtained from the hardware.  The default is 0.

Thanks,
Richard
Divya Koppera March 7, 2022, 4:40 a.m. UTC | #3
Hi Andrew,

Thanks for review and comments. Please find reply inline below.

> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Friday, March 4, 2022 6:21 PM
> To: Divya Koppera - I30481 <Divya.Koppera@microchip.com>
> Cc: netdev@vger.kernel.org; hkallweit1@gmail.com; linux@armlinux.org.uk;
> davem@davemloft.net; kuba@kernel.org; robh+dt@kernel.org;
> devicetree@vger.kernel.org; richardcochran@gmail.com; linux-
> kernel@vger.kernel.org; UNGLinuxDriver <UNGLinuxDriver@microchip.com>;
> Madhuri Sripada - I34878 <Madhuri.Sripada@microchip.com>; Manohar Puri -
> I30488 <Manohar.Puri@microchip.com>
> Subject: Re: [PATCH net-next 2/3] dt-bindings: net: micrel: Configure latency
> values and timestamping check for LAN8814 phy
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
> 
> On Fri, Mar 04, 2022 at 03:04:17PM +0530, Divya Koppera wrote:
> > Supports configuring latency values and also adds check for phy
> > timestamping feature.
> >
> > Signed-off-by: Divya Koppera<Divya.Koppera@microchip.com>
> > ---
> >  .../devicetree/bindings/net/micrel.txt          | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/net/micrel.txt
> > b/Documentation/devicetree/bindings/net/micrel.txt
> > index 8d157f0295a5..c5ab62c39133 100644
> > --- a/Documentation/devicetree/bindings/net/micrel.txt
> > +++ b/Documentation/devicetree/bindings/net/micrel.txt
> > @@ -45,3 +45,20 @@ Optional properties:
> >
> >       In fiber mode, auto-negotiation is disabled and the PHY can only work in
> >       100base-fx (full and half duplex) modes.
> > +
> > + - lan8814,ignore-ts: If present the PHY will not support timestamping.
> > +
> > +     This option acts as check whether Timestamping is supported by
> > +     hardware or not. LAN8814 phy support hardware tmestamping.
> 
> Does this mean the hardware itself cannot tell you it is missing the needed
> hardware? What happens when you forget to add this flag? Does the driver
> timeout waiting for hardware which does not exists?
> 

If forgot to add this flag, driver will try to register ptp_clock that needs
access to clock related registers, which in turn fails if those registers doesn't exists.

> > + - lan8814,latency_rx_10: Configures Latency value of phy in ingress at 10
> Mbps.
> > +
> > + - lan8814,latency_tx_10: Configures Latency value of phy in egress at 10
> Mbps.
> > +
> > + - lan8814,latency_rx_100: Configures Latency value of phy in ingress at 100
> Mbps.
> > +
> > + - lan8814,latency_tx_100: Configures Latency value of phy in egress at 100
> Mbps.
> > +
> > + - lan8814,latency_rx_1000: Configures Latency value of phy in ingress at
> 1000 Mbps.
> > +
> > + - lan8814,latency_tx_1000: Configures Latency value of phy in egress at
> 1000 Mbps.
> 
> Why does this need to be configured, rather than hard coded? Why would the
> latency for a given speed change? I would of thought though you would take
> the average length of a PTP packet and divide is by the link speed.
> 

This latency values could be different for different phy's. So hardcoding will not work here.
Yes in our case latency values depends on port speed. It is delay between network medium and 
PTP timestamp point.

>      Andrew
Divya Koppera March 7, 2022, 5:02 a.m. UTC | #4
Hi Richard,

> -----Original Message-----
> From: Richard Cochran <richardcochran@gmail.com>
> Sent: Friday, March 4, 2022 7:26 PM
> To: Andrew Lunn <andrew@lunn.ch>
> Cc: Divya Koppera - I30481 <Divya.Koppera@microchip.com>;
> netdev@vger.kernel.org; hkallweit1@gmail.com; linux@armlinux.org.uk;
> davem@davemloft.net; kuba@kernel.org; robh+dt@kernel.org;
> devicetree@vger.kernel.org; linux-kernel@vger.kernel.org; UNGLinuxDriver
> <UNGLinuxDriver@microchip.com>; Madhuri Sripada - I34878
> <Madhuri.Sripada@microchip.com>; Manohar Puri - I30488
> <Manohar.Puri@microchip.com>
> Subject: Re: [PATCH net-next 2/3] dt-bindings: net: micrel: Configure latency
> values and timestamping check for LAN8814 phy
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
> 
> On Fri, Mar 04, 2022 at 01:50:47PM +0100, Andrew Lunn wrote:
> > Why does this need to be configured, rather than hard coded? Why would
> > the latency for a given speed change? I would of thought though you
> > would take the average length of a PTP packet and divide is by the
> > link speed.
> 
> Latency is unrelated to frame length.
> 
> My understanding is that it is VERY tricky to measure PHY latency.
> Studies have shown that some PHYs vary by link speed, and some vary
> randomly, frame by frame.
> 
> So I can understand wanting to configure it.  However, DTS is probably the
> wrong place.  The linuxptp user space stack has configuration variables for this
> purpose:
> 
>        egressLatency
>               Specifies  the  difference  in  nanoseconds  between  the actual
>               transmission time at the reference plane and the reported trans‐
>               mit  time  stamp. This value will be added to egress time stamps
>               obtained from the hardware.  The default is 0.
> 
>        ingressLatency
>               Specifies the difference in nanoseconds between the reported re‐
>               ceive  time  stamp  and  the  actual reception time at reference
>               plane. This value will be subtracted from  ingress  time  stamps
>               obtained from the hardware.  The default is 0.
> 

I will check this and come back with fix if it is applicable.

Thanks,
Divya

> Thanks,
> Richard
Andrew Lunn March 7, 2022, 1:08 p.m. UTC | #5
> > > +
> > > + - lan8814,ignore-ts: If present the PHY will not support timestamping.
> > > +
> > > +     This option acts as check whether Timestamping is supported by
> > > +     hardware or not. LAN8814 phy support hardware tmestamping.
> > 
> > Does this mean the hardware itself cannot tell you it is missing the needed
> > hardware? What happens when you forget to add this flag? Does the driver
> > timeout waiting for hardware which does not exists?
> > 
> 
> If forgot to add this flag, driver will try to register ptp_clock that needs
> access to clock related registers, which in turn fails if those registers doesn't exists.

Thanks for the reply, but you did not answer my question:

  Does this mean the hardware itself cannot tell you it is missing the
  needed hardware?

Don't you have different IDs in register 2 and 3 for those devices
with clock register and those without?

> > > + - lan8814,latency_rx_10: Configures Latency value of phy in ingress at 10
> > Mbps.
> > > +
> > > + - lan8814,latency_tx_10: Configures Latency value of phy in egress at 10
> > Mbps.
> > > +
> > > + - lan8814,latency_rx_100: Configures Latency value of phy in ingress at 100
> > Mbps.
> > > +
> > > + - lan8814,latency_tx_100: Configures Latency value of phy in egress at 100
> > Mbps.
> > > +
> > > + - lan8814,latency_rx_1000: Configures Latency value of phy in ingress at
> > 1000 Mbps.
> > > +
> > > + - lan8814,latency_tx_1000: Configures Latency value of phy in egress at
> > 1000 Mbps.
> > 
> > Why does this need to be configured, rather than hard coded? Why would the
> > latency for a given speed change? I would of thought though you would take
> > the average length of a PTP packet and divide is by the link speed.
> > 
> 
> This latency values could be different for different phy's. So hardcoding will not work here.

But you do actually have hard coded defaults. Those odd hex values i
pointed out.

By different PHYs do you mean different PHY versions? So you can look
at register 2 and 3, determine what PHY it is, and so from that what
defaults should be used? Or do you mean different boards with the same
PHY?

In general, the less tunables you have, the better. If the driver can
figure it out, it is better to not have DT properties. The PHY will
then also work with ACPI and USB etc, where there is no
DT. Implementing the user space API Richard pointed out will also
allow your PHY to work with none DP systems.

> Yes in our case latency values depends on port speed. It is delay between network medium and 
> PTP timestamp point.

What are the units. You generally have the units in the property
name. So e.g. lan8814,latency_tx_1000_ns. If need be, the driver then
converts to whatever value you place into the register.

If you do keep them, please make it clear that these values are
optional, and state what value will be used when the property is not
present.

	Andrew
Rob Herring (Arm) March 7, 2022, 9:33 p.m. UTC | #6
On Fri, Mar 04, 2022 at 03:04:17PM +0530, Divya Koppera wrote:
> Supports configuring latency values and also adds
> check for phy timestamping feature.
> 
> Signed-off-by: Divya Koppera<Divya.Koppera@microchip.com>

should be a space here:       ^

> ---
>  .../devicetree/bindings/net/micrel.txt          | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)

Please convert this to DT schema.

> 
> diff --git a/Documentation/devicetree/bindings/net/micrel.txt b/Documentation/devicetree/bindings/net/micrel.txt
> index 8d157f0295a5..c5ab62c39133 100644
> --- a/Documentation/devicetree/bindings/net/micrel.txt
> +++ b/Documentation/devicetree/bindings/net/micrel.txt
> @@ -45,3 +45,20 @@ Optional properties:
>  
>  	In fiber mode, auto-negotiation is disabled and the PHY can only work in
>  	100base-fx (full and half duplex) modes.
> +
> + - lan8814,ignore-ts: If present the PHY will not support timestamping.

'lan8814' is not a vendor and the format for properties is 
<vendor>,<propname>.

Is this configuration or lack of h/w feature? IOW, instead of 'will 
not', 'does not' or 'timestamping is disabled.'. As configuration, that 
seems like common property. For (lack of) h/w features, that should be 
implied by the compatible or VID/PID.

> +	This option acts as check whether Timestamping is supported by
> +	hardware or not. LAN8814 phy support hardware tmestamping.
> +
> + - lan8814,latency_rx_10: Configures Latency value of phy in ingress at 10 Mbps.

s/_/-/

What are the units here? Is this a common feature of PHYs?

> +
> + - lan8814,latency_tx_10: Configures Latency value of phy in egress at 10 Mbps.
> +
> + - lan8814,latency_rx_100: Configures Latency value of phy in ingress at 100 Mbps.
> +
> + - lan8814,latency_tx_100: Configures Latency value of phy in egress at 100 Mbps.
> +
> + - lan8814,latency_rx_1000: Configures Latency value of phy in ingress at 1000 Mbps.
> +
> + - lan8814,latency_tx_1000: Configures Latency value of phy in egress at 1000 Mbps.
> -- 
> 2.17.1
> 
>
Divya Koppera March 8, 2022, 10:05 a.m. UTC | #7
> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Monday, March 7, 2022 6:39 PM
> To: Divya Koppera - I30481 <Divya.Koppera@microchip.com>
> Cc: netdev@vger.kernel.org; hkallweit1@gmail.com; linux@armlinux.org.uk;
> davem@davemloft.net; kuba@kernel.org; robh+dt@kernel.org;
> devicetree@vger.kernel.org; richardcochran@gmail.com; linux-
> kernel@vger.kernel.org; UNGLinuxDriver <UNGLinuxDriver@microchip.com>;
> Madhuri Sripada - I34878 <Madhuri.Sripada@microchip.com>; Manohar Puri -
> I30488 <Manohar.Puri@microchip.com>
> Subject: Re: [PATCH net-next 2/3] dt-bindings: net: micrel: Configure latency
> values and timestamping check for LAN8814 phy
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
> 
> > > > +
> > > > + - lan8814,ignore-ts: If present the PHY will not support timestamping.
> > > > +
> > > > +     This option acts as check whether Timestamping is supported by
> > > > +     hardware or not. LAN8814 phy support hardware tmestamping.
> > >
> > > Does this mean the hardware itself cannot tell you it is missing the
> > > needed hardware? What happens when you forget to add this flag? Does
> > > the driver timeout waiting for hardware which does not exists?
> > >
> >
> > If forgot to add this flag, driver will try to register ptp_clock that
> > needs access to clock related registers, which in turn fails if those registers
> doesn't exists.
> 
> Thanks for the reply, but you did not answer my question:
> 
>   Does this mean the hardware itself cannot tell you it is missing the
>   needed hardware?
> 
> Don't you have different IDs in register 2 and 3 for those devices with clock
> register and those without?
> 

The purpose of this option is, if both PHY and MAC supports timestamping then always timestamping is done in PHY.
If timestamping need to be done in MAC we need a way to stop PHY timestamping. If this flag is used then timestamping is taken care by MAC.

> > > > + - lan8814,latency_rx_10: Configures Latency value of phy in
> > > > + ingress at 10
> > > Mbps.
> > > > +
> > > > + - lan8814,latency_tx_10: Configures Latency value of phy in
> > > > + egress at 10
> > > Mbps.
> > > > +
> > > > + - lan8814,latency_rx_100: Configures Latency value of phy in
> > > > + ingress at 100
> > > Mbps.
> > > > +
> > > > + - lan8814,latency_tx_100: Configures Latency value of phy in
> > > > + egress at 100
> > > Mbps.
> > > > +
> > > > + - lan8814,latency_rx_1000: Configures Latency value of phy in
> > > > + ingress at
> > > 1000 Mbps.
> > > > +
> > > > + - lan8814,latency_tx_1000: Configures Latency value of phy in
> > > > + egress at
> > > 1000 Mbps.
> > >
> > > Why does this need to be configured, rather than hard coded? Why
> > > would the latency for a given speed change? I would of thought
> > > though you would take the average length of a PTP packet and divide is by
> the link speed.
> > >
> >
> > This latency values could be different for different phy's. So hardcoding will
> not work here.
> 
> But you do actually have hard coded defaults. Those odd hex values i pointed
> out.
> 
> By different PHYs do you mean different PHY versions? So you can look at
> register 2 and 3, determine what PHY it is, and so from that what defaults
> should be used? Or do you mean different boards with the same PHY?
> 
> In general, the less tunables you have, the better. If the driver can figure it out,
> it is better to not have DT properties. The PHY will then also work with ACPI
> and USB etc, where there is no DT. Implementing the user space API Richard
> pointed out will also allow your PHY to work with none DP systems.
> 

Sorry I answered wrong. Latency values vary depending on the position of PHY in board. 
We have used this PHY in different hardware's, where latency values differs based on PHY positioning. 
So we used latency option in DTS file.
If you have other ideas or I'm wrong please let me know?

> > Yes in our case latency values depends on port speed. It is delay
> > between network medium and PTP timestamp point.
> 
> What are the units. You generally have the units in the property name. So e.g.
> lan8814,latency_tx_1000_ns. If need be, the driver then converts to whatever
> value you place into the register.
> 
> If you do keep them, please make it clear that these values are optional, and
> state what value will be used when the property is not present.
> 

Yes units are Nanoseconds.

>         Andrew
Andrew Lunn March 8, 2022, 1:54 p.m. UTC | #8
> > Thanks for the reply, but you did not answer my question:
> > 
> >   Does this mean the hardware itself cannot tell you it is missing the
> >   needed hardware?
> > 
> > Don't you have different IDs in register 2 and 3 for those devices with clock
> > register and those without?
> > 
> 

> The purpose of this option is, if both PHY and MAC supports
> timestamping then always timestamping is done in PHY.  If
> timestamping need to be done in MAC we need a way to stop PHY
> timestamping. If this flag is used then timestamping is taken care
> by MAC.

This is not a valid use of DT, since this is configuration, not
describing the hardware. There has been recent extension in the UAPI
to allow user space to do this configuration. Please look at that
work.

> Sorry I answered wrong. Latency values vary depending on the position of PHY in board. 
> We have used this PHY in different hardware's, where latency values differs based on PHY positioning. 
> So we used latency option in DTS file.
> If you have other ideas or I'm wrong please let me know?

So this is a function of the track length between the MAC and the PHY?
How do you determine these values? There is no point having
configuration values if you don't document how to determine what value
should be used.

       Andrew
Richard Cochran March 8, 2022, 2:54 p.m. UTC | #9
On Tue, Mar 08, 2022 at 02:54:37PM +0100, Andrew Lunn wrote:
> This is not a valid use of DT, since this is configuration, not
> describing the hardware. There has been recent extension in the UAPI
> to allow user space to do this configuration. Please look at that
> work.

Yes, I had an RFC up that hopefully will merge soon.

In the mean time, just implement the PHC/time stamping in your PHY
driver unconditionally.

Thanks,
Richard
Horatiu Vultur March 8, 2022, 3:43 p.m. UTC | #10
Hi Andrew,

The 03/08/2022 14:54, Andrew Lunn wrote:
> 
> > > Thanks for the reply, but you did not answer my question:
> > >
> > >   Does this mean the hardware itself cannot tell you it is missing the
> > >   needed hardware?
> > >
> > > Don't you have different IDs in register 2 and 3 for those devices with clock
> > > register and those without?
> > >
> >
> 
> > The purpose of this option is, if both PHY and MAC supports
> > timestamping then always timestamping is done in PHY.  If
> > timestamping need to be done in MAC we need a way to stop PHY
> > timestamping. If this flag is used then timestamping is taken care
> > by MAC.
> 
> This is not a valid use of DT, since this is configuration, not
> describing the hardware. There has been recent extension in the UAPI
> to allow user space to do this configuration. Please look at that
> work.

Ah ... now we have found Richard patch series.
So we will remove this option and once Richard's patch series will be
accepted we will use that.

> 
> > Sorry I answered wrong. Latency values vary depending on the position of PHY in board.
> > We have used this PHY in different hardware's, where latency values differs based on PHY positioning.
> > So we used latency option in DTS file.
> > If you have other ideas or I'm wrong please let me know?
> 
> So this is a function of the track length between the MAC and the PHY?

Nope.
This latency represents the time it takes for the frame to travel from RJ45
module to the timestamping unit inside the PHY. To be more precisely,
the timestamping unit will do the timestamp when it detects the end of
the start of the frame. So it represents the time from when the frame
reaches the RJ45 to when the end of start of the frame reaches the
timestamping unit inside the PHY.

And because each board manufacture could put the same PHY but in
different places, then each of them would have a different latency.
That is the main reason why we put this latencies in the DT and not put
them inside the driver. Because we think each board manufacture will
need to use different values.

Another reason is that we want the board manufacture to determine these
values and not the end users. I have seen that also Richard commenting
on this, saying that the latencies should not be in DT.
Currently I don't know where else they can be. I know that ptp4l has
these option in SW to update the ingress/egress latencies but if someone
else is running another application, what will they do?

> How do you determine these values?

This is a little bit more complicated.
So first you will need a device that you know already that is
calibrated. Then you connect the device that you want to calibrate to
the calibrated one with a known length cable. We presume that there is a
5ns delay per meter of the cable. And then basically we run ptp4l on
each device where the master will be the calibrated one and the slave
will be the device that will be calibrated. When we run ptp4l we can see
mean path delay, and we subtract the delay introduced by the cable(5ns)
and then we take this value and divided by 2. And then
the result is added to the current rx latency and subtracted from tx
latency.
This is how we have calculated the values.

> There is no point having
> configuration values if you don't document how to determine what value
> should be used.

I agree, we should do a better job at this and also explaining what
these values represent. Definitely we will do that in the next patch.

> 
>        Andrew
Andrew Lunn March 8, 2022, 6:10 p.m. UTC | #11
> > So this is a function of the track length between the MAC and the PHY?
> 
> Nope.
> This latency represents the time it takes for the frame to travel from RJ45
> module to the timestamping unit inside the PHY. To be more precisely,
> the timestamping unit will do the timestamp when it detects the end of
> the start of the frame. So it represents the time from when the frame
> reaches the RJ45 to when the end of start of the frame reaches the
> timestamping unit inside the PHY.

I must be missing something here. How do you measure the latency
difference for a 1 meter cable vs a 100m cable? Does 100m cable plus
1cm of track from the RJ45 to the PHY make a difference compared to
100m cable plus 1.5cm of track? Isn't this error all just in the
noise?

   Andrew
Horatiu Vultur March 8, 2022, 10:14 p.m. UTC | #12
The 03/08/2022 19:10, Andrew Lunn wrote:
> 
> > > So this is a function of the track length between the MAC and the PHY?
> >
> > Nope.
> > This latency represents the time it takes for the frame to travel from RJ45
> > module to the timestamping unit inside the PHY. To be more precisely,
> > the timestamping unit will do the timestamp when it detects the end of
> > the start of the frame. So it represents the time from when the frame
> > reaches the RJ45 to when the end of start of the frame reaches the
> > timestamping unit inside the PHY.
> 
> I must be missing something here. How do you measure the latency
> difference for a 1 meter cable vs a 100m cable?

In the same way because the end result will be the same.
Lets presume that the cable introduce a 5ns latency per meter.
So, if we use a 1m cable and the mean path delay is 11, then
the latency is 11 - 5.
If we use a 100m cable then the mean path delay will be 506(if is not
506 then is something wrong) then the latency is 506 - 500.

> Does 100m cable plus 1cm of track from the RJ45 to the PHY make a difference
> compared to 100m cable plus 1.5cm of track?

In that case I don't think you will see any difference.

> Isn't this error all just in the noise?

I am not sure I follow this question.

> 
>    Andrew
Andrew Lunn March 8, 2022, 11:36 p.m. UTC | #13
On Tue, Mar 08, 2022 at 11:14:04PM +0100, Horatiu Vultur wrote:
> The 03/08/2022 19:10, Andrew Lunn wrote:
> > 
> > > > So this is a function of the track length between the MAC and the PHY?
> > >
> > > Nope.
> > > This latency represents the time it takes for the frame to travel from RJ45
> > > module to the timestamping unit inside the PHY. To be more precisely,
> > > the timestamping unit will do the timestamp when it detects the end of
> > > the start of the frame. So it represents the time from when the frame
> > > reaches the RJ45 to when the end of start of the frame reaches the
> > > timestamping unit inside the PHY.
> > 
> > I must be missing something here. How do you measure the latency
> > difference for a 1 meter cable vs a 100m cable?
> 
> In the same way because the end result will be the same.

The latency from the RJ45 to the PHY will be the same. But the latency
from the link peer PHY to the local PHY will be much more, 500ns. In
order for this RJ45 to PHY delay to be meaningful, don't you also need
to know the length of the cable? Is there a configuration knob
somewhere for the cable length?

I'm assuming the ptp protocol does not try to measure the cable delay,
since if it did, there would be no need to know the RJ45-PHY delay, it
would be part of that.

> > Isn't this error all just in the noise?
> 
> I am not sure I follow this question.

At minimum, you expect to have a 1m cable. The RJ45-PHY track length
is maybe 2cm? So 2% of the overall length. So you are trying to
correct the error this 2% causes. If you have a 100m cable, 0.02% is
RJ45-PHY part that you are trying to correct the error on. These
numbers seem so small, it seems pointless. It only seems to make sense
if you know the length of the cable, and to an accuracy of a few cm.

   Andrew
Richard Cochran March 9, 2022, 1:46 a.m. UTC | #14
On Wed, Mar 09, 2022 at 12:36:54AM +0100, Andrew Lunn wrote:

> I'm assuming the ptp protocol does not try to measure the cable delay,
> since if it did, there would be no need to know the RJ45-PHY delay, it
> would be part of that.

The PTP does indeed measure the cable delay.  With a well tuned
system, you can tell the copper cable length directly from the
measured delay.

The problem with uncorrected PHY time stamps is that they affect the
boundary point between the node and the network.  A static error there
will create a path asymmetry that can neither be measured nor
corrected by the PTP, and a variable error degrades the time signal.


Thanks,
Richard
Richard Cochran March 9, 2022, 1:58 a.m. UTC | #15
On Tue, Mar 08, 2022 at 05:46:47PM -0800, Richard Cochran wrote:
> The problem with uncorrected PHY time stamps is that they affect the
> boundary point between the node and the network.  A static error there
> will create a path asymmetry that can neither be measured nor
> corrected by the PTP, and a variable error degrades the time signal.

And FWIW, the imperfections in the cable *also* introduce path
asymmetry and thus uncorrected offsets.  The twisted pairs are never
exactly the same length in both directions, for example.

However the PHY delays can be in the microseconds, but cable delay
deltas in the nanoseconds, so correcting PHY delay is much more
important.

Thanks,
Richard
Horatiu Vultur March 9, 2022, 1:24 p.m. UTC | #16
The 03/09/2022 00:36, Andrew Lunn wrote:
> 
> On Tue, Mar 08, 2022 at 11:14:04PM +0100, Horatiu Vultur wrote:
> > The 03/08/2022 19:10, Andrew Lunn wrote:
> > >
> > > > > So this is a function of the track length between the MAC and the PHY?
> > > >
> > > > Nope.
> > > > This latency represents the time it takes for the frame to travel from RJ45
> > > > module to the timestamping unit inside the PHY. To be more precisely,
> > > > the timestamping unit will do the timestamp when it detects the end of
> > > > the start of the frame. So it represents the time from when the frame
> > > > reaches the RJ45 to when the end of start of the frame reaches the
> > > > timestamping unit inside the PHY.
> > >
> > > I must be missing something here. How do you measure the latency
> > > difference for a 1 meter cable vs a 100m cable?
> >
> > In the same way because the end result will be the same.
> 
> The latency from the RJ45 to the PHY will be the same. But the latency
> from the link peer PHY to the local PHY will be much more, 500ns. In
> order for this RJ45 to PHY delay to be meaningful, don't you also need
> to know the length of the cable? Is there a configuration knob
> somewhere for the cable length?
> 
> I'm assuming the ptp protocol does not try to measure the cable delay,
> since if it did, there would be no need to know the RJ45-PHY delay, it
> would be part of that.
> 
> > > Isn't this error all just in the noise?
> >
> > I am not sure I follow this question.
> 
> At minimum, you expect to have a 1m cable. The RJ45-PHY track length
> is maybe 2cm? So 2% of the overall length. So you are trying to
> correct the error this 2% causes. If you have a 100m cable, 0.02% is
> RJ45-PHY part that you are trying to correct the error on. These
> numbers seem so small, it seems pointless. It only seems to make sense
> if you know the length of the cable, and to an accuracy of a few cm.

I am not trying to adjust for the length of the cable.
If we have the following drawing:

 MAC                     PHY                    RJ45
-----       --------------------------       --------
|   |       |                        |       |       |
|   |<----->|timestamp | FIFO | GPHY |<----->|       |<------> Peer
|   |       |   unit                 |       |       |
-----       --------------------------       --------
                 ^                                   ^
                 |            latency                |
                 -------------------------------------

I am trying to calculate this latency, which includes a 2cm of track +
latency inside the PHY. As Richard mentioned also the PHY introduce some
latency which can be microseconds.

I understand if we consider that this latency should not be in the DT
and be part of the driver because the latency over the 2cm or 1.5cm of track
is almost nothing. But then what about the case when we want to add these
latencies to a MAC? They will depend on the latency inside the PHY so
those should come from DT.

So it really doesn't matter to me if I use a 1m cable or 100m cable.
What it matters is to see that mean path delay will be ~5ns for 1m cable
and ~500ns for 100m cable. And if is not, then I need to update the
register to calculate correctly the latency from RJ45 to timestamp unit
in the PHY.

> 

>    Andrew
Russell King (Oracle) March 9, 2022, 2:55 p.m. UTC | #17
On Wed, Mar 09, 2022 at 02:24:43PM +0100, Horatiu Vultur wrote:
> The 03/09/2022 00:36, Andrew Lunn wrote:
> > 
> > On Tue, Mar 08, 2022 at 11:14:04PM +0100, Horatiu Vultur wrote:
> > > The 03/08/2022 19:10, Andrew Lunn wrote:
> > > >
> > > > > > So this is a function of the track length between the MAC and the PHY?
> > > > >
> > > > > Nope.
> > > > > This latency represents the time it takes for the frame to travel from RJ45
> > > > > module to the timestamping unit inside the PHY. To be more precisely,
> > > > > the timestamping unit will do the timestamp when it detects the end of
> > > > > the start of the frame. So it represents the time from when the frame
> > > > > reaches the RJ45 to when the end of start of the frame reaches the
> > > > > timestamping unit inside the PHY.
> > > >
> > > > I must be missing something here. How do you measure the latency
> > > > difference for a 1 meter cable vs a 100m cable?
> > >
> > > In the same way because the end result will be the same.
> > 
> > The latency from the RJ45 to the PHY will be the same. But the latency
> > from the link peer PHY to the local PHY will be much more, 500ns. In
> > order for this RJ45 to PHY delay to be meaningful, don't you also need
> > to know the length of the cable? Is there a configuration knob
> > somewhere for the cable length?
> > 
> > I'm assuming the ptp protocol does not try to measure the cable delay,
> > since if it did, there would be no need to know the RJ45-PHY delay, it
> > would be part of that.
> > 
> > > > Isn't this error all just in the noise?
> > >
> > > I am not sure I follow this question.
> > 
> > At minimum, you expect to have a 1m cable. The RJ45-PHY track length
> > is maybe 2cm? So 2% of the overall length. So you are trying to
> > correct the error this 2% causes. If you have a 100m cable, 0.02% is
> > RJ45-PHY part that you are trying to correct the error on. These
> > numbers seem so small, it seems pointless. It only seems to make sense
> > if you know the length of the cable, and to an accuracy of a few cm.
> 
> I am not trying to adjust for the length of the cable.
> If we have the following drawing:
> 
>  MAC                     PHY                    RJ45
> -----       --------------------------       --------
> |   |       |                        |       |       |
> |   |<----->|timestamp | FIFO | GPHY |<----->|       |<------> Peer
> |   |       |   unit                 |       |       |
> -----       --------------------------       --------
>                  ^                                   ^
>                  |            latency                |
>                  -------------------------------------
> 
> I am trying to calculate this latency, which includes a 2cm of track +
> latency inside the PHY. As Richard mentioned also the PHY introduce some
> latency which can be microseconds.

I think we understand this, and compensating for the delay in the PHY
is quite reasonable, which surely will be a fixed amount irrespective
of the board.

However, Andrew's point is that the latency introduced by the copper
wire between the PHY and the RJ45 is insignificant, so insignificant
it's not worth bothering with - and I agree.

> I understand if we consider that this latency should not be in the DT
> and be part of the driver because the latency over the 2cm or 1.5cm of track
> is almost nothing. But then what about the case when we want to add these
> latencies to a MAC? They will depend on the latency inside the PHY so
> those should come from DT.

If you want to measure it to the MAC, then yes, the latency through
the PHY needs to be considered, and we probably need some new
interfaces inside the kernel so that MAC drivers can query phylib
to discover what the delay is. I don't think this is soemthing that
should be thrown into firmware, since the delay inside the PHY
should be constant (depending on what MAC side interface mode is
selected.)

Having it in firmware means that we're reliant on people ensuring
that they've looked up the right value for the PHY and its interface
mode not just once, but for every board out there - and if an error
is found, it brings up the question whether it should be corrected
on all boards or just one (and then there'll be questions why some
people have chosen randomly different values.)

> So it really doesn't matter to me if I use a 1m cable or 100m cable.
> What it matters is to see that mean path delay will be ~5ns for 1m cable
> and ~500ns for 100m cable. And if is not, then I need to update the
> register to calculate correctly the latency from RJ45 to timestamp unit
> in the PHY.

Does this mean you ask the user how long the cable is? Do you get them
to measure it to the nearest millimeter?

What about the overlap in the RJ45 connectors, and the height of the
pins in the RJ45? Some RJ45 connectors have staggered lengths for
their pins which would affect the true length. What about the total
length of the conductors in the RJ45 socket to the point that the
RJ45 plug makes contact? What happens if production then has to
change the make of RJ45 socket due to supply issues (which given
what is going on in the world at the moment is not unlikely.)

If you care about the 20mm or so on the board, then you ought to care
about all these other factors as well, and I suspect you're going to
be hard pressed to gather all that.
Richard Cochran March 9, 2022, 7:52 p.m. UTC | #18
On Wed, Mar 09, 2022 at 02:55:49PM +0000, Russell King (Oracle) wrote:

> I think we understand this, and compensating for the delay in the PHY
> is quite reasonable, which surely will be a fixed amount irrespective
> of the board.

The PHY delays are not fixed.  They can be variable, even packet to packet.

https://www.researchgate.net/publication/260434179_Measurement_of_egress_and_ingress_delays_of_PTP_clocks

https://www.researchgate.net/publication/265731050_Experimental_verification_of_the_egress_and_ingress_latency_correction_in_PTP_clocks

Some PHYs are well behaved.  Some are not.

In any case, the linuxptp user space stack supports the standardized
method of correcting a system's delay asymmetry.  IMO it makes no
sense to even try to let kernel device drivers correct these delays.
Driver authors will get it wrong, and indeed they have already tried
and failed.  And when the magic numbers change from one kernel release
to another, it only makes the end user's job harder, because they will
have to update their scripts to correct the bogus numbers.

Thanks,
Richard
Horatiu Vultur March 11, 2022, 2:07 p.m. UTC | #19
The 03/09/2022 14:55, Russell King (Oracle) wrote:
> 
> On Wed, Mar 09, 2022 at 02:24:43PM +0100, Horatiu Vultur wrote:
> > The 03/09/2022 00:36, Andrew Lunn wrote:
> > >
> > > On Tue, Mar 08, 2022 at 11:14:04PM +0100, Horatiu Vultur wrote:
> > > > The 03/08/2022 19:10, Andrew Lunn wrote:
> > > > >
> > > > > > > So this is a function of the track length between the MAC and the PHY?
> > > > > >
> > > > > > Nope.
> > > > > > This latency represents the time it takes for the frame to travel from RJ45
> > > > > > module to the timestamping unit inside the PHY. To be more precisely,
> > > > > > the timestamping unit will do the timestamp when it detects the end of
> > > > > > the start of the frame. So it represents the time from when the frame
> > > > > > reaches the RJ45 to when the end of start of the frame reaches the
> > > > > > timestamping unit inside the PHY.
> > > > >
> > > > > I must be missing something here. How do you measure the latency
> > > > > difference for a 1 meter cable vs a 100m cable?
> > > >
> > > > In the same way because the end result will be the same.
> > >
> > > The latency from the RJ45 to the PHY will be the same. But the latency
> > > from the link peer PHY to the local PHY will be much more, 500ns. In
> > > order for this RJ45 to PHY delay to be meaningful, don't you also need
> > > to know the length of the cable? Is there a configuration knob
> > > somewhere for the cable length?
> > >
> > > I'm assuming the ptp protocol does not try to measure the cable delay,
> > > since if it did, there would be no need to know the RJ45-PHY delay, it
> > > would be part of that.
> > >
> > > > > Isn't this error all just in the noise?
> > > >
> > > > I am not sure I follow this question.
> > >
> > > At minimum, you expect to have a 1m cable. The RJ45-PHY track length
> > > is maybe 2cm? So 2% of the overall length. So you are trying to
> > > correct the error this 2% causes. If you have a 100m cable, 0.02% is
> > > RJ45-PHY part that you are trying to correct the error on. These
> > > numbers seem so small, it seems pointless. It only seems to make sense
> > > if you know the length of the cable, and to an accuracy of a few cm.
> >
> > I am not trying to adjust for the length of the cable.
> > If we have the following drawing:
> >
> >  MAC                     PHY                    RJ45
> > -----       --------------------------       --------
> > |   |       |                        |       |       |
> > |   |<----->|timestamp | FIFO | GPHY |<----->|       |<------> Peer
> > |   |       |   unit                 |       |       |
> > -----       --------------------------       --------
> >                  ^                                   ^
> >                  |            latency                |
> >                  -------------------------------------
> >
> > I am trying to calculate this latency, which includes a 2cm of track +
> > latency inside the PHY. As Richard mentioned also the PHY introduce some
> > latency which can be microseconds.
> 
> I think we understand this, and compensating for the delay in the PHY
> is quite reasonable, which surely will be a fixed amount irrespective
> of the board.
> 
> However, Andrew's point is that the latency introduced by the copper
> wire between the PHY and the RJ45 is insignificant, so insignificant
> it's not worth bothering with - and I agree.

OK, that is fine for me, we can ignore the latency introduced by the
copper wire.

> 
> > I understand if we consider that this latency should not be in the DT
> > and be part of the driver because the latency over the 2cm or 1.5cm of track
> > is almost nothing. But then what about the case when we want to add these
> > latencies to a MAC? They will depend on the latency inside the PHY so
> > those should come from DT.
> 
> If you want to measure it to the MAC, then yes, the latency through
> the PHY needs to be considered, and we probably need some new
> interfaces inside the kernel so that MAC drivers can query phylib
> to discover what the delay is.

Does that mean that each PHY needs to implement a new API? Because that
would be a little bit of work to do there.

> I don't think this is soemthing that
> should be thrown into firmware, since the delay inside the PHY
> should be constant (depending on what MAC side interface mode is
> selected.)
> 
> Having it in firmware means that we're reliant on people ensuring
> that they've looked up the right value for the PHY and its interface
> mode not just once, but for every board out there - and if an error
> is found, it brings up the question whether it should be corrected
> on all boards or just one (and then there'll be questions why some
> people have chosen randomly different values.)
> 
> > So it really doesn't matter to me if I use a 1m cable or 100m cable.
> > What it matters is to see that mean path delay will be ~5ns for 1m cable
> > and ~500ns for 100m cable. And if is not, then I need to update the
> > register to calculate correctly the latency from RJ45 to timestamp unit
> > in the PHY.
> 
> Does this mean you ask the user how long the cable is? Do you get them
> to measure it to the nearest millimeter?

My expectation is that the end user should not care about the length of
the cable. He just needs to start ptp4l and have some sane results.
Only the board manufacture were supposed to know the length of the
cable to set their latency values.

> 
> What about the overlap in the RJ45 connectors, and the height of the
> pins in the RJ45? Some RJ45 connectors have staggered lengths for
> their pins which would affect the true length. What about the total
> length of the conductors in the RJ45 socket to the point that the
> RJ45 plug makes contact? What happens if production then has to
> change the make of RJ45 socket due to supply issues (which given
> what is going on in the world at the moment is not unlikely.)

If they don't introduce any significat latency then we can ignore them.

> 
> If you care about the 20mm or so on the board, then you ought to care
> about all these other factors as well, and I suspect you're going to
> be hard pressed to gather all that.
> 
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Horatiu Vultur March 11, 2022, 2:28 p.m. UTC | #20
The 03/09/2022 11:52, Richard Cochran wrote:
> 
> On Wed, Mar 09, 2022 at 02:55:49PM +0000, Russell King (Oracle) wrote:
> 
> > I think we understand this, and compensating for the delay in the PHY
> > is quite reasonable, which surely will be a fixed amount irrespective
> > of the board.
> 
> The PHY delays are not fixed.  They can be variable, even packet to packet.
> 
> https://www.researchgate.net/publication/260434179_Measurement_of_egress_and_ingress_delays_of_PTP_clocks
> 
> https://www.researchgate.net/publication/265731050_Experimental_verification_of_the_egress_and_ingress_latency_correction_in_PTP_clocks
> 
> Some PHYs are well behaved.  Some are not.

What about adding only some sane values in the driver like here [1].
And the allow the user to use linuxptp to fine tune all this.

> 
> In any case, the linuxptp user space stack supports the standardized
> method of correcting a system's delay asymmetry.  IMO it makes no
> sense to even try to let kernel device drivers correct these delays.
> Driver authors will get it wrong, and indeed they have already tried
> and failed.  And when the magic numbers change from one kernel release
> to another, it only makes the end user's job harder, because they will
> have to update their scripts to correct the bogus numbers.
> 
> Thanks,
> Richard
> 

[1] https://elixir.bootlin.com/linux/v5.17-rc7/source/drivers/net/phy/mscc/mscc_ptp.c#L245
Richard Cochran March 11, 2022, 3:08 p.m. UTC | #21
On Fri, Mar 11, 2022 at 03:28:14PM +0100, Horatiu Vultur wrote:

> What about adding only some sane values in the driver like here [1].
> And the allow the user to use linuxptp to fine tune all this.

I mean, that is the point.  Users will surely have to tune it
themselves, second guessing the driver in any case.  So having hard
coded constants in the driver is useless.

Probably even the tuned values will differ by link speed, so having
the per-link speed constants in the driver doesn't help either.

(And yes, linuxptp should offer configuration variables per link
speed, monitor actual link speed, and switch automatically.  So far no
one is demanding that loudly)

Thanks,
Richard
Woojung Huh March 11, 2022, 3:21 p.m. UTC | #22
Hi Richard,

Not sure that it is good idea to reply on not-the-latest thread.

> -----Original Message-----
> From: Richard Cochran <richardcochran@gmail.com>
> Sent: Wednesday, March 9, 2022 2:53 PM
> To: Russell King (Oracle) <linux@armlinux.org.uk>
> Cc: Horatiu Vultur - M31836 <Horatiu.Vultur@microchip.com>; Andrew Lunn
> <andrew@lunn.ch>; Divya Koppera - I30481
> <Divya.Koppera@microchip.com>; netdev@vger.kernel.org;
> hkallweit1@gmail.com; davem@davemloft.net; kuba@kernel.org;
> robh+dt@kernel.org; devicetree@vger.kernel.org; linux-
> kernel@vger.kernel.org; UNGLinuxDriver
> <UNGLinuxDriver@microchip.com>; Madhuri Sripada - I34878
> <Madhuri.Sripada@microchip.com>; Manohar Puri - I30488
> <Manohar.Puri@microchip.com>
> Subject: Re: [PATCH net-next 2/3] dt-bindings: net: micrel: Configure latency
> values and timestamping check for LAN8814 phy
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
> 
> On Wed, Mar 09, 2022 at 02:55:49PM +0000, Russell King (Oracle) wrote:
> 
> > I think we understand this, and compensating for the delay in the PHY
> > is quite reasonable, which surely will be a fixed amount irrespective
> > of the board.
> 
> The PHY delays are not fixed.  They can be variable, even packet to packet.
> 
> https://www.researchgate.net/publication/260434179_Measurement_of_e
> gress_and_ingress_delays_of_PTP_clocks
> 
> https://www.researchgate.net/publication/265731050_Experimental_verific
> ation_of_the_egress_and_ingress_latency_correction_in_PTP_clocks
> 
> Some PHYs are well behaved.  Some are not.
> 
> In any case, the linuxptp user space stack supports the standardized
> method of correcting a system's delay asymmetry.  IMO it makes no
> sense to even try to let kernel device drivers correct these delays.
> Driver authors will get it wrong, and indeed they have already tried
> and failed.  And when the magic numbers change from one kernel release
> to another, it only makes the end user's job harder, because they will
> have to update their scripts to correct the bogus numbers.
> 

If you are referring to the delayAsymmetry of ptp4l, I think that is different from this latency value.
delayAsymmetry of ptp4l says "The time difference in nanoseconds of the transmit and receive  paths. 
This value should be positive when the master-to-slave propagation time is longer and negative
when the slave-to-master time is longer. The default is 0 nanoseconds."
In my understanding, master-to-slave uses reference timestamp which is defined in IEEE specs.
   <egressTimestamp> = <egressProvidedTimestamp> + <egressLatency>
   <ingressTimestamp> = <ingressProvidedTimestamp> - <ingressLatency>

These latency is egreeLatency or ingressLatency to get accurate timestamp at reference point from 
timestamp of clock in MAC or PHY.
So, this latency should (hopefully) be not-much-change in the same board after manufactured. 
But, value can be different from design to design and port to port if some path (PHY to RJ45) is longer than others.

This doesn't cover any latency from cable length and/or asymmetry which may come from RJ45-to-RJ45.
But, delayAsymmetry may care cable type/length in application point of view.

Of cause, all values may be small enough to ignore though.
Do I miss something here?

Thanks.
Woojung
Richard Cochran March 12, 2022, 2:48 a.m. UTC | #23
On Fri, Mar 11, 2022 at 03:21:58PM +0000, Woojung.Huh@microchip.com wrote:

> If you are referring to the delayAsymmetry of ptp4l,

No, really, I'm not.

       PTP4l(8)                    System Manager's Manual                   PTP4l(8)

       NAME
           ptp4l - PTP Boundary/Ordinary/Transparent Clock

       ...

       egressLatency
              Specifies  the  difference  in  nanoseconds  between  the actual
              transmission time at the reference plane and the reported trans‐
              mit  time  stamp. This value will be added to egress time stamps
              obtained from the hardware.  The default is 0.

       ingressLatency
              Specifies the difference in nanoseconds between the reported re‐
              ceive  time  stamp  and  the  actual reception time at reference
              plane. This value will be subtracted from  ingress  time  stamps
              obtained from the hardware.  The default is 0.

> So, this latency should (hopefully) be not-much-change in the same board after manufactured. 

Please read the papers on this topic.  I posted links in another reply
in this thread.

> Of cause, all values may be small enough to ignore though.
> Do I miss something here?

Yes, you miss the point entirely.  PHY delays are relatively large and
cannot be even measured in some cases.  However, for well behaved
PHYs, the user space stack already covers the configuration.

Thanks,
Richard
Allan W. Nielsen March 12, 2022, 7:36 p.m. UTC | #24
Hi Richard,

On Fri, Mar 11, 2022 at 07:08:42AM -0800, Richard Cochran wrote:
> On Fri, Mar 11, 2022 at 03:28:14PM +0100, Horatiu Vultur wrote:
> 
> > What about adding only some sane values in the driver like here [1].
> > And the allow the user to use linuxptp to fine tune all this.
> 
> I mean, that is the point.  Users will surely have to tune it
> themselves, second guessing the driver in any case.  So having hard
> coded constants in the driver is useless.
> 
> Probably even the tuned values will differ by link speed, so having
> the per-link speed constants in the driver doesn't help either.
> 
> (And yes, linuxptp should offer configuration variables per link
> speed, monitor actual link speed, and switch automatically.  So far no
> one is demanding that loudly)

I did skim through the articles, and as you hinted he does find small
latency differences across different packets. (but as I understood, very
few PHYs was tested).

Also, I know that we (Vitesse -> Microsemi -> Microchip) have been
offering ways to calibrate the individual PHYs in other PTP-SW products.
So, this makes good sense.

With this in mind, I do agree with you that it does not make much sense
to compensate they few cm of PCB tracks without also calibrating for
differences from packet to packet.

But this is not really an argument for not having _default_ values
hard-coded in the driver (or DT, but lets forget about DT for now).
Looking at the default compensation numbers provided in the driver, they
are a lot larger than what we expect to find in calibration.

As pointed out by other, most users will not care about the small error
introduced by the few cm PCB track. My claim is that if we provide
default hard-coded delay values in the driver, most users will not care
about the few ns noise that each packet differs. And those who do care,
have all the hooks and handle to calibrate further by using PTP4L.

If we do not offer default delays directly in the driver, everybody will
have to calibrate all boards just to have decent results, we will not
have a good way to provide default delay numbers, and this will be
different from what is done in other drivers.

I do understand that you have a concern that these numbers may change in
future updates. But this has not been a problem in other drivers doing
the same. But if this is still a concern, we can add a comment to say
that these numbers must be treated as UAPI, and chancing them, may
cause regressions on calibrated PHYs.

Long story short, I can see any real down-sides of adding these delay
numbers, and I see plenty in not doing so.

/Allan
Andrew Lunn March 12, 2022, 8:04 p.m. UTC | #25
>        PTP4l(8)                    System Manager's Manual                   PTP4l(8)
> 
>        NAME
>            ptp4l - PTP Boundary/Ordinary/Transparent Clock
> 
>        ...
> 
>        egressLatency
>               Specifies  the  difference  in  nanoseconds  between  the actual
>               transmission time at the reference plane and the reported trans‐
>               mit  time  stamp. This value will be added to egress time stamps
>               obtained from the hardware.  The default is 0.
> 
>        ingressLatency
>               Specifies the difference in nanoseconds between the reported re‐
>               ceive  time  stamp  and  the  actual reception time at reference
>               plane. This value will be subtracted from  ingress  time  stamps
>               obtained from the hardware.  The default is 0.
> 

Hi Richard

Do these get passed to the kernel so the hardware can act on them, or
are they used purely in userspace by ptp4l?

If they has passed to the kernel, could we provide a getter as well as
a setter, so the defaults hard coded in the driver can be read back?

	Andrew
Richard Cochran March 13, 2022, 2:41 a.m. UTC | #26
On Sat, Mar 12, 2022 at 08:36:20PM +0100, Allan W. Nielsen wrote:
> With this in mind, I do agree with you that it does not make much sense
> to compensate they few cm of PCB tracks without also calibrating for
> differences from packet to packet.

The PCB traces AFTER the PHY are part of the network, and if they
contribute to path asymmetry, then that can and should be corrected
using the delayAsymmetry configuration variable.
 
> If we do not offer default delays directly in the driver, everybody will
> have to calibrate all boards just to have decent results, we will not
> have a good way to provide default delay numbers, and this will be
> different from what is done in other drivers.

Who says the other drivers are even remotely reasonable?  Not me.
I've been fighting this voodoo engineering all along, but people seem
to ignore me.

> I do understand that you have a concern that these numbers may change in
> future updates. But this has not been a problem in other drivers doing
> the same.

Wrong.  See the git history of the i210 driver.  Also the data sheets.

> But if this is still a concern, we can add a comment to say
> that these numbers must be treated as UAPI, and chancing them, may
> cause regressions on calibrated PHYs.

Comments will be ignored.  And when the next batch of developers comes
along, they will ignore your prohibition.

Thanks,
Richard
Richard Cochran March 13, 2022, 2:46 a.m. UTC | #27
On Sat, Mar 12, 2022 at 09:04:31PM +0100, Andrew Lunn wrote:
> Do these get passed to the kernel so the hardware can act on them, or
> are they used purely in userspace by ptp4l?

user space only.
 
> If they has passed to the kernel, could we provide a getter as well as
> a setter, so the defaults hard coded in the driver can be read back?

Any hard coded defaults in the kernel are a nuisance.

I mean, do you want user space to say,

   "okay, so I know the correct value is X.  But the drivers may offer
   random values according to kernel version.  So, I'll read out the
   driver value Y, and then apply X-Y."

Insanity.

(not to mention that this fails for older kernels without the proposed
interface)

Thanks,
Richard
Richard Cochran March 13, 2022, 4:04 a.m. UTC | #28
On Sat, Mar 12, 2022 at 08:36:20PM +0100, Allan W. Nielsen wrote:
> I did skim through the articles, and as you hinted he does find small
> latency differences across different packets. (but as I understood, very
> few PHYs was tested).

There is also previous work cited in those articles.
 
> But this is not really an argument for not having _default_ values
> hard-coded in the driver (or DT, but lets forget about DT for now).

You put them in the DTS.  That means you expect them to need changes.

DTS is the WRONG place for such things.

If your numbers are perfect, then do the corrections in silicon/firmware.

If the numbers aren't 100% perfect, then provide your customers with a
test report providing the recommended numbers.  Include a proper
explanation of the test methodology and assumptions used in your
analysis.  Heck, you can even given them linuxptp config snippets (and
other for other popular PTP stacks, Oregano, ixat, ptpd, etc)

Don't hard code random, changing numbers into kernel drivers.

Thanks,
Richard
Andrew Lunn March 13, 2022, 3:07 p.m. UTC | #29
On Sat, Mar 12, 2022 at 06:46:46PM -0800, Richard Cochran wrote:
> On Sat, Mar 12, 2022 at 09:04:31PM +0100, Andrew Lunn wrote:
> > Do these get passed to the kernel so the hardware can act on them, or
> > are they used purely in userspace by ptp4l?
> 
> user space only.
>  
> > If they has passed to the kernel, could we provide a getter as well as
> > a setter, so the defaults hard coded in the driver can be read back?
> 
> Any hard coded defaults in the kernel are a nuisance.
> 
> I mean, do you want user space to say,
> 
>    "okay, so I know the correct value is X.  But the drivers may offer
>    random values according to kernel version.  So, I'll read out the
>    driver value Y, and then apply X-Y."
> 
> Insanity.

No, i would not suggests that at all.

You quoted the man page and it says the default it zero. If there was
an API to ask the driver what correction it is doing, and an API to
offload the delay correction to the hardware, i would simply remove
the comment about the default being zero. If these calls return
-EOPNOTSUPP, then user space stays the same, and does actually use a
default of 0. If offload is supported, you can show the user the
current absolute values, and allow the user to set the absolute
values.

Anyway, it is clear you don't want the driver doing any correction, so
lets stop this discussion.

     Andrew
Allan W. Nielsen March 13, 2022, 7:37 p.m. UTC | #30
On Sun, Mar 13, 2022 at 04:07:24PM +0100, Andrew Lunn wrote:
> On Sat, Mar 12, 2022 at 06:46:46PM -0800, Richard Cochran wrote:
> > On Sat, Mar 12, 2022 at 09:04:31PM +0100, Andrew Lunn wrote:
> > > Do these get passed to the kernel so the hardware can act on them, or
> > > are they used purely in userspace by ptp4l?
> >
> > user space only.
I'm wondering if one-step will work if these correction values are not
applied to HW.

> > > If they has passed to the kernel, could we provide a getter as well as
> > > a setter, so the defaults hard coded in the driver can be read back?
> >
> > Any hard coded defaults in the kernel are a nuisance.
> >
> > I mean, do you want user space to say,
> >
> >    "okay, so I know the correct value is X.  But the drivers may offer
> >    random values according to kernel version.  So, I'll read out the
> >    driver value Y, and then apply X-Y."
> >
> > Insanity.
> 
> No, i would not suggests that at all.
> 
> You quoted the man page and it says the default it zero. If there was
> an API to ask the driver what correction it is doing, and an API to
> offload the delay correction to the hardware, i would simply remove
> the comment about the default being zero. If these calls return
> -EOPNOTSUPP, then user space stays the same, and does actually use a
> default of 0. If offload is supported, you can show the user the
> current absolute values, and allow the user to set the absolute
> values.
This sounds like a good approach to me (but I know it is not my opinion
you are asking for).

In all cases, if there is a desire to have such APIs, and let drivers
advertise default compensation values in this way, we can work on that.

> Anyway, it is clear you don't want the driver doing any correction, so
> lets stop this discussion.
> 
>      Andrew
Richard Cochran March 13, 2022, 8:03 p.m. UTC | #31
On Sun, Mar 13, 2022 at 08:37:44PM +0100, Allan W. Nielsen wrote:
> On Sun, Mar 13, 2022 at 04:07:24PM +0100, Andrew Lunn wrote:
> > On Sat, Mar 12, 2022 at 06:46:46PM -0800, Richard Cochran wrote:
> > > On Sat, Mar 12, 2022 at 09:04:31PM +0100, Andrew Lunn wrote:
> > > > Do these get passed to the kernel so the hardware can act on them, or
> > > > are they used purely in userspace by ptp4l?
> > >
> > > user space only.
> I'm wondering if one-step will work if these correction values are not
> applied to HW.

They are applied to the time stamps that are available to the
program.  So, no, they obviously won't be applied to one step Sync.
But then again, neither will the driver values.

(You could imagine a HW tstamp unit that includes a correction factor,
for example by adding the egress time stamp value to the correction
field.  But there are no APIs for that, and maybe no HW either.)

(BTW one step is overrated IMO)

Thanks,
Richard
Richard Cochran March 13, 2022, 8:07 p.m. UTC | #32
On Sun, Mar 13, 2022 at 04:07:24PM +0100, Andrew Lunn wrote:
> Anyway, it is clear you don't want the driver doing any correction, so
> lets stop this discussion.

Okay.  Just to be clear, as an end user, I've already been burnt by
well meaning but false corrections in the driver.  As maintainer I
must advocate for the end users.

Thanks,
Richard
diff mbox series

Patch

diff --git a/Documentation/devicetree/bindings/net/micrel.txt b/Documentation/devicetree/bindings/net/micrel.txt
index 8d157f0295a5..c5ab62c39133 100644
--- a/Documentation/devicetree/bindings/net/micrel.txt
+++ b/Documentation/devicetree/bindings/net/micrel.txt
@@ -45,3 +45,20 @@  Optional properties:
 
 	In fiber mode, auto-negotiation is disabled and the PHY can only work in
 	100base-fx (full and half duplex) modes.
+
+ - lan8814,ignore-ts: If present the PHY will not support timestamping.
+
+	This option acts as check whether Timestamping is supported by
+	hardware or not. LAN8814 phy support hardware tmestamping.
+
+ - lan8814,latency_rx_10: Configures Latency value of phy in ingress at 10 Mbps.
+
+ - lan8814,latency_tx_10: Configures Latency value of phy in egress at 10 Mbps.
+
+ - lan8814,latency_rx_100: Configures Latency value of phy in ingress at 100 Mbps.
+
+ - lan8814,latency_tx_100: Configures Latency value of phy in egress at 100 Mbps.
+
+ - lan8814,latency_rx_1000: Configures Latency value of phy in ingress at 1000 Mbps.
+
+ - lan8814,latency_tx_1000: Configures Latency value of phy in egress at 1000 Mbps.