[v2] net: stmmac: correct MAC propagation delay

Message ID 20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de (mailing list archive)
State Superseded
Commit 20bf98c94146eb6fe62177817cb32f53e72dd2e8
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/series_format            warning  Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1369 this patch: 1369
netdev/cc_maintainers           success  CCed 13 of 13 maintainers
netdev/build_clang              success  Errors and warnings before: 1365 this patch: 1365
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1392 this patch: 1392
netdev/checkpatch               warning  WARNING: line length of 84 exceeds 80 columns; WARNING: line length of 86 exceeds 80 columns; WARNING: line length of 88 exceeds 80 columns; WARNING: line length of 89 exceeds 80 columns; WARNING: line length of 92 exceeds 80 columns
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Johannes Zink July 24, 2023, 10:01 a.m. UTC
The IEEE1588 Standard specifies that the timestamps of Packets must be
captured when the PTP message timestamp point (leading edge of first
octet after the start of frame delimiter) crosses the boundary between
the node and the network. As the MAC latches the timestamp at an
internal point, the captured timestamp must be corrected for the
additional path latency, as described in the publicly available
datasheet [1].

This patch only corrects for the MAC-internal delay, which can be read
out from the MAC_Ingress_Timestamp_Latency register, since the PHY
framework currently does not support querying the PHY ingress and egress
latency. The Clock Domain Crossing circuit errors indicated in [1]
are already accounted for in the stmmac_get_tx_hwtstamp() function and
are not corrected here.

As the Latency varies for different link speeds and MII
modes of operation, the correction value needs to be updated on each
link state change.

As the delay also causes a phase shift in the timestamp counter compared
to the rest of the network, this correction will also reduce phase error
when generating PPS outputs from the timestamp counter.

[1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
correction"

Signed-off-by: Johannes Zink <j.zink@pengutronix.de>
---
Changes in v2:
- fix builds for 32bit, this was found by the kernel build bot
	Reported-by: kernel test robot <lkp@intel.com>
	Closes: https://lore.kernel.org/oe-kbuild-all/202307200225.B8rmKQPN-lkp@intel.com/
- while at it, also fix an overflow caused by shifting a u32 constant from a
  macro by 10 bits, by casting the constant to u64
- Link to v1: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v1-1-768aa4d09334@pengutronix.de
---
 drivers/net/ethernet/stmicro/stmmac/hwif.h         |  3 ++
 .../net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c  | 43 ++++++++++++++++++++++
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  4 ++
 drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h   |  6 +++
 4 files changed, 56 insertions(+)


---
base-commit: ba80e20d7f3f87dab3f9f0c0ca66e4b1fcc7be9f
change-id: 20230719-stmmac_correct_mac_delay-4278cb9d9bc1

Best regards,

Comments

Jakub Kicinski July 26, 2023, 3:06 a.m. UTC | #1
On Mon, 24 Jul 2023 12:01:31 +0200 Johannes Zink wrote:
> The IEEE1588 Standard specifies that the timestamps of Packets must be
> captured when the PTP message timestamp point (leading edge of first
> octet after the start of frame delimiter) crosses the boundary between
> the node and the network. As the MAC latches the timestamp at an
> internal point, the captured timestamp must be corrected for the
> additional path latency, as described in the publicly available
> datasheet [1].
> 
> This patch only corrects for the MAC-Internal delay, which can be read
> out from the MAC_Ingress_Timestamp_Latency register, since the Phy
> framework currently does not support querying the Phy ingress and egress
> latency. The Clock Domain Crossing Circuits errors as indicated in [1]
> are already being accounted in the stmmac_get_tx_hwtstamp() function and
> are not corrected here.
> 
> As the Latency varies for different link speeds and MII
> modes of operation, the correction value needs to be updated on each
> link state change.
> 
> As the delay also causes a phase shift in the timestamp counter compared
> to the rest of the network, this correction will also reduce phase error
> when generating PPS outputs from the timestamp counter.
> 
> [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
> correction"

Hi Richard,

any opinion on this one?

The subject read to me like it's about *MII clocking delays, I figured
you may have missed it, too.

> diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
> index 6ee7cf07cfd7..95a4d6099577 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
> @@ -536,6 +536,7 @@ struct stmmac_hwtimestamp {
>  	void (*get_systime) (void __iomem *ioaddr, u64 *systime);
>  	void (*get_ptptime)(void __iomem *ioaddr, u64 *ptp_time);
>  	void (*timestamp_interrupt)(struct stmmac_priv *priv);
> +	void (*correct_latency)(struct stmmac_priv *priv);
>  };
>  
>  #define stmmac_config_hw_tstamping(__priv, __args...) \
> @@ -554,6 +555,8 @@ struct stmmac_hwtimestamp {
>  	stmmac_do_void_callback(__priv, ptp, get_ptptime, __args)
>  #define stmmac_timestamp_interrupt(__priv, __args...) \
>  	stmmac_do_void_callback(__priv, ptp, timestamp_interrupt, __args)
> +#define stmmac_correct_latency(__priv, __args...) \
> +	stmmac_do_void_callback(__priv, ptp, correct_latency, __args)
>  
>  struct stmmac_tx_queue;
>  struct stmmac_rx_queue;
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
> index fa2c3ba7e9fe..7e0fa024e0ad 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
> @@ -60,6 +60,48 @@ static void config_sub_second_increment(void __iomem *ioaddr,
>  		*ssinc = data;
>  }
>  
> +static void correct_latency(struct stmmac_priv *priv)
> +{
> +	void __iomem *ioaddr = priv->ptpaddr;
> +	u32 reg_tsic, reg_tsicsns;
> +	u32 reg_tsec, reg_tsecsns;
> +	u64 scaled_ns;
> +	u32 val;
> +
> +	/* MAC-internal ingress latency */
> +	scaled_ns = readl(ioaddr + PTP_TS_INGR_LAT);
> +
> +	/* See section 11.7.2.5.3.1 "Ingress Correction" on page 4001 of
> +	 * i.MX8MP Applications Processor Reference Manual Rev. 1, 06/2021
> +	 */
> +	val = readl(ioaddr + PTP_TCR);
> +	if (val & PTP_TCR_TSCTRLSSR)
> +		/* nanoseconds field is in decimal format with granularity of 1ns/bit */
> +		scaled_ns = ((u64)NSEC_PER_SEC << 16) - scaled_ns;
> +	else
> +		/* nanoseconds field is in binary format with granularity of ~0.466ns/bit */
> +		scaled_ns = ((1ULL << 31) << 16) -
> +			DIV_U64_ROUND_CLOSEST(scaled_ns * PSEC_PER_NSEC, 466U);
> +
> +	reg_tsic = scaled_ns >> 16;
> +	reg_tsicsns = scaled_ns & 0xff00;
> +
> +	/* set bit 31 for 2's complement */
> +	reg_tsic |= BIT(31);
> +
> +	writel(reg_tsic, ioaddr + PTP_TS_INGR_CORR_NS);
> +	writel(reg_tsicsns, ioaddr + PTP_TS_INGR_CORR_SNS);
> +
> +	/* MAC-internal egress latency */
> +	scaled_ns = readl(ioaddr + PTP_TS_EGR_LAT);
> +
> +	reg_tsec = scaled_ns >> 16;
> +	reg_tsecsns = scaled_ns & 0xff00;
> +
> +	writel(reg_tsec, ioaddr + PTP_TS_EGR_CORR_NS);
> +	writel(reg_tsecsns, ioaddr + PTP_TS_EGR_CORR_SNS);
> +}
> +
>  static int init_systime(void __iomem *ioaddr, u32 sec, u32 nsec)
>  {
>  	u32 value;
> @@ -221,4 +263,5 @@ const struct stmmac_hwtimestamp stmmac_ptp = {
>  	.get_systime = get_systime,
>  	.get_ptptime = get_ptptime,
>  	.timestamp_interrupt = timestamp_interrupt,
> +	.correct_latency = correct_latency,
>  };
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index efe85b086abe..ee78e69e9ae3 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -909,6 +909,8 @@ static int stmmac_init_ptp(struct stmmac_priv *priv)
>  	priv->hwts_tx_en = 0;
>  	priv->hwts_rx_en = 0;
>  
> +	stmmac_correct_latency(priv, priv);
> +
>  	return 0;
>  }
>  
> @@ -1094,6 +1096,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
>  
>  	if (priv->dma_cap.fpesel)
>  		stmmac_fpe_link_state_handle(priv, true);
> +
> +	stmmac_correct_latency(priv, priv);
>  }
>  
>  static const struct phylink_mac_ops stmmac_phylink_mac_ops = {
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> index bf619295d079..d1fe4b46f162 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> @@ -26,6 +26,12 @@
>  #define	PTP_ACR		0x40	/* Auxiliary Control Reg */
>  #define	PTP_ATNR	0x48	/* Auxiliary Timestamp - Nanoseconds Reg */
>  #define	PTP_ATSR	0x4c	/* Auxiliary Timestamp - Seconds Reg */
> +#define	PTP_TS_INGR_CORR_NS	0x58	/* Ingress timestamp correction nanoseconds */
> +#define	PTP_TS_EGR_CORR_NS	0x5C	/* Egress timestamp correction nanoseconds*/
> +#define	PTP_TS_INGR_CORR_SNS	0x60	/* Ingress timestamp correction subnanoseconds */
> +#define	PTP_TS_EGR_CORR_SNS	0x64	/* Egress timestamp correction subnanoseconds */
> +#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
> +#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */
>  
>  #define	PTP_STNSUR_ADDSUB_SHIFT	31
>  #define	PTP_DIGITAL_ROLLOVER_MODE	0x3B9ACA00	/* 10e9-1 ns */
> 
> ---
> base-commit: ba80e20d7f3f87dab3f9f0c0ca66e4b1fcc7be9f
> change-id: 20230719-stmmac_correct_mac_delay-4278cb9d9bc1
> 
> Best regards,
Richard Cochran July 26, 2023, 3:22 a.m. UTC | #2
On Tue, Jul 25, 2023 at 08:06:06PM -0700, Jakub Kicinski wrote:

> any opinion on this one?

Yeah, I saw it, but I can't get excited about drivers trying to
correct delays.  I don't think this can be done automatically in a
reliable way, and so I expect that the few end users who are really
getting into the microseconds and nanoseconds will calibrate their
systems end to end, maybe even patching out this driver nonsense in
their kernels.

Having said that, I won't stand in the way of such driver stuff.
After all, who cares about a few microseconds time error one way or
the other?

Thanks,
Richard
Jakub Kicinski July 26, 2023, 3:39 a.m. UTC | #3
On Tue, 25 Jul 2023 20:22:53 -0700 Richard Cochran wrote:
> > any opinion on this one?  
> 
> Yeah, I saw it, but I can't get excited about drivers trying to
> correct delays.  I don't think this can be done automatically in a
> reliable way, and so I expect that the few end users who are really
> getting into the microseconds and nanoseconds will calibrate their
> systems end to end, maybe even patching out this driver nonsense in
> their kernels.
> 
> Having said that, I won't stand in the way of such driver stuff.
> After all, who cares about a few microseconds time error one way or
> the other?

I see :)
patchwork-bot+netdevbpf@kernel.org July 26, 2023, 3:50 a.m. UTC | #4
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 24 Jul 2023 12:01:31 +0200 you wrote:
> The IEEE1588 Standard specifies that the timestamps of Packets must be
> captured when the PTP message timestamp point (leading edge of first
> octet after the start of frame delimiter) crosses the boundary between
> the node and the network. As the MAC latches the timestamp at an
> internal point, the captured timestamp must be corrected for the
> additional path latency, as described in the publicly available
> datasheet [1].
> 
> [...]

Here is the summary with links:
  - [v2] net: stmmac: correct MAC propagation delay
    https://git.kernel.org/netdev/net-next/c/20bf98c94146

You are awesome, thank you!
Marc Kleine-Budde July 26, 2023, 5:50 a.m. UTC | #5
On 25.07.2023 20:06:06, Jakub Kicinski wrote:
> On Mon, 24 Jul 2023 12:01:31 +0200 Johannes Zink wrote:
> > The IEEE1588 Standard specifies that the timestamps of Packets must be
> > captured when the PTP message timestamp point (leading edge of first
> > octet after the start of frame delimiter) crosses the boundary between
> > the node and the network. As the MAC latches the timestamp at an
> > internal point, the captured timestamp must be corrected for the
> > additional path latency, as described in the publicly available
> > datasheet [1].
> > 
> > This patch only corrects for the MAC-Internal delay, which can be read
> > out from the MAC_Ingress_Timestamp_Latency register, since the Phy
> > framework currently does not support querying the Phy ingress and egress
> > latency. The Clock Domain Crossing Circuits errors as indicated in [1]
> > are already being accounted in the stmmac_get_tx_hwtstamp() function and
> > are not corrected here.
> > 
> > As the Latency varies for different link speeds and MII
> > modes of operation, the correction value needs to be updated on each
> > link state change.
> > 
> > As the delay also causes a phase shift in the timestamp counter compared
> > to the rest of the network, this correction will also reduce phase error
> > when generating PPS outputs from the timestamp counter.
> > 
> > [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
> > correction"
> 
> Hi Richard,
> 
> any opinion on this one?
> 
> The subject read to me like it's about *MII clocking delays, I figured
> you may have missed it, too.

The patch description clarifies what is being corrected, namely the
"MAC-internal delay, which can be read out from the
MAC_Ingress_Timestamp_Latency register".

The next step would be to correct PHY latency, but there is no support
for querying PHY latency yet.

regards,
Marc
Marc Kleine-Budde July 26, 2023, 6:04 a.m. UTC | #6
On 25.07.2023 20:22:53, Richard Cochran wrote:
> On Tue, Jul 25, 2023 at 08:06:06PM -0700, Jakub Kicinski wrote:
> 
> > any opinion on this one?
> 
> Yeah, I saw it, but I can't get excited about drivers trying to
> correct delays.  I don't think this can be done automatically in a
> reliable way,

At least the datasheet of the IP core says to read the MAC delay from
the IP core (1), add the PHY delay (2) and the clock domain crossing
delay (3), and write the sum to the timestamp correction register.

(1) added in this patch
(2) future work
(3) already in the driver,
    though corrected manually when reading the timestamp

At least in our measurements the peer delay is better with this patch
(measured with ptp4l) and the end-to-end delay (comparison of 2 PPS
signals on a scope) is also better.

> and so I expect that the few end users who are really
> getting into the microseconds and nanoseconds will calibrate their
> systems end to end, maybe even patching out this driver nonsense in
> their kernels.

What issues make you think this change/approach is counterproductive?

> Having said that, I won't stand in the way of such driver stuff.
> After all, who cares about a few microseconds time error one way or
> the other?

There are several companies that use or plan to use PTP in their
products and are striving to achieve sub-microsecond synchronization.

regards,
Marc
Johannes Zink July 26, 2023, 6:10 a.m. UTC | #7
Hi Richard,

On 7/26/23 05:22, Richard Cochran wrote:
> On Tue, Jul 25, 2023 at 08:06:06PM -0700, Jakub Kicinski wrote:
> 
>> any opinion on this one?
> 
> Yeah, I saw it, but I can't get excited about drivers trying to
> correct delays.  I don't think this can be done automatically in a
> reliable way, and so I expect that the few end users who are really
> getting into the microseconds and nanoseconds will calibrate their
> systems end to end, maybe even patching out this driver nonsense in
> their kernels.
> 

Thanks for reading and commenting on my patch. As the commit message
elaborates, the patch corrects for the MAC-internal delays (these are neither PHY
delays nor cable delays) that arise from the timestamps not being taken at the
packet egress, but at an internal point in the MAC. The compensation values are
read from internal registers of the hardware, since these values depend on the
actual operational mode of the MAC and on the MII link. I have done extensive
testing, and as far as my results are concerned, this is reliable at least on
the i.MX8MP hardware I can access for testing. I would actually like to correct
this on other MACs too, but they are often poorly documented. I have to admit
that the DWMAC is one of the first pieces of hardware I have encountered with
proper documentation. The driver admittedly still has room for improvement - so
here we go...

Nevertheless, there are still PHY delays to be corrected for, but I need to
extend the PHY framework for querying the clause 45 registers to account for
the PHY delays (which are an even larger factor). I plan to send another
series fixing this, but it still needs some cleanup.

Also, on a side note, "driver nonsense" sounds a bit harsh from someone always
insisting that one should not compensate for bad drivers in the userspace stack
and should instead fix driver and hardware issues in the kernel, don't you think?

> Having said that, I won't stand in the way of such driver stuff.
> After all, who cares about a few microseconds time error one way or
> the other?

I do, and so does my customer. If you want to reach sub-microsecond accuracy
with a linuxptp setup (which is absolutely feasible on COTS hardware), you have
to take these things into account. I did quite extensive tests, and measuring
the peer delay as precisely as possible is one of the key steps in getting
offsets down between physical nodes. As I use the PHCs to recover clocks with
as low a phase offset as possible, the peer delays matter, as they add phase
error. At the moment, this patch reduces the offset from approx. 150 ns to
<50 ns in a real-world application, which is not so bad for a few lines of
code, I guess...

I don't want to kick off a lengthy discussion here (especially since Jakub 
already picked the patch to next), but maybe this mail can help for 
clarification in the future, when the next poor soul does work on the hwtstamps 
in the dwmac.

Thanks, also for keeping linuxptp going,
Johannes

> 
> Thanks,
> Richard
> 
>
Richard Cochran July 26, 2023, 3:34 p.m. UTC | #8
On Wed, Jul 26, 2023 at 08:04:37AM +0200, Marc Kleine-Budde wrote:

> At least the datasheet of the IP core tells to read the MAC delay from
> the IP core (1), add the PHY delay (2) and the clock domain crossing
> delay (3) and write it to the time stamp correction register.

That is great, until they change the data sheet.  Really, this happens.

Thanks,
Richard
Richard Cochran July 26, 2023, 3:43 p.m. UTC | #9
On Wed, Jul 26, 2023 at 08:10:35AM +0200, Johannes Zink wrote:

> Also on a side-note, "driver nonsense" sounds a bit harsh from someone
> always insisting that one should not compensate for bad drivers in the
> userspace stack and instead fixing driver and hardware issues in the kernel,
> don't you think?

Everything has its place.

The proper place to account for delay asymmetries is in the user space
configuration, for example in linuxptp you have

       delayAsymmetry
              The  time  difference in nanoseconds of the transmit and receive
              paths. This value should be positive when  the  server-to-client
              propagation  time  is  longer  and  negative when the client-to-
              server time is longer. The default is 0 nanoseconds.

       egressLatency
              Specifies the difference in nanoseconds between the actual
              transmission time at the reference plane and the reported
              transmit time stamp. This value will be added to egress time
              stamps obtained from the hardware.  The default is 0.

       ingressLatency
              Specifies the difference in nanoseconds between the reported
              receive time stamp and the actual reception time at reference
              plane. This value will be subtracted from ingress time stamps
              obtained from the hardware.  The default is 0.

Trying to hard code those into the driver?  Good luck getting that
right for everyone.

BTW this driver is actually for an IP core used in many, many SoCs.

How many _other_ SoCs did you test your patch on?

Thanks,
Richard
Richard Cochran July 26, 2023, 6 p.m. UTC | #10
On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:

> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> index bf619295d079..d1fe4b46f162 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> @@ -26,6 +26,12 @@
>  #define	PTP_ACR		0x40	/* Auxiliary Control Reg */
>  #define	PTP_ATNR	0x48	/* Auxiliary Timestamp - Nanoseconds Reg */
>  #define	PTP_ATSR	0x4c	/* Auxiliary Timestamp - Seconds Reg */
> +#define	PTP_TS_INGR_CORR_NS	0x58	/* Ingress timestamp correction nanoseconds */
> +#define	PTP_TS_EGR_CORR_NS	0x5C	/* Egress timestamp correction nanoseconds*/
> +#define	PTP_TS_INGR_CORR_SNS	0x60	/* Ingress timestamp correction subnanoseconds */
> +#define	PTP_TS_EGR_CORR_SNS	0x64	/* Egress timestamp correction subnanoseconds */

These two...

> +#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
> +#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */

do not exist on earlier versions of the IP core.

I wonder what values are there?

Thanks,
Richard
Richard Cochran July 26, 2023, 8:57 p.m. UTC | #11
On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:

Earlier versions of the IP core return zero from these...

> +#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
> +#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */

and so...

> +static void correct_latency(struct stmmac_priv *priv)
> +{
> +	void __iomem *ioaddr = priv->ptpaddr;
> +	u32 reg_tsic, reg_tsicsns;
> +	u32 reg_tsec, reg_tsecsns;
> +	u64 scaled_ns;
> +	u32 val;
> +
> +	/* MAC-internal ingress latency */
> +	scaled_ns = readl(ioaddr + PTP_TS_INGR_LAT);
> +
> +	/* See section 11.7.2.5.3.1 "Ingress Correction" on page 4001 of
> +	 * i.MX8MP Applications Processor Reference Manual Rev. 1, 06/2021
> +	 */
> +	val = readl(ioaddr + PTP_TCR);
> +	if (val & PTP_TCR_TSCTRLSSR)
> +		/* nanoseconds field is in decimal format with granularity of 1ns/bit */
> +		scaled_ns = ((u64)NSEC_PER_SEC << 16) - scaled_ns;
> +	else
> +		/* nanoseconds field is in binary format with granularity of ~0.466ns/bit */
> +		scaled_ns = ((1ULL << 31) << 16) -
> +			DIV_U64_ROUND_CLOSEST(scaled_ns * PSEC_PER_NSEC, 466U);
> +
> +	reg_tsic = scaled_ns >> 16;
> +	reg_tsicsns = scaled_ns & 0xff00;
> +
> +	/* set bit 31 for 2's complement */
> +	reg_tsic |= BIT(31);
> +
> +	writel(reg_tsic, ioaddr + PTP_TS_INGR_CORR_NS);

here reg_tsic = 0x80000000 for a correction of -2.15 seconds!

@Jakub  Can you please revert this patch?

Thanks,
Richard
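
To make the zero-latency case concrete, the ingress arithmetic from the quoted
patch can be replayed in userspace. The following is a minimal sketch, not
driver code: the struct and function names are hypothetical, the macros are
redefined locally, and the raw latency register value is treated as
nanoseconds scaled by 2^16, as the patch does.

```c
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define PSEC_PER_NSEC 1000ULL

/* Hypothetical container for the two values the patch writes to the
 * ingress correction registers (nanoseconds and sub-nanoseconds).
 */
struct ingress_corr {
	uint32_t ns;
	uint32_t sns;
};

/* lat_reg: raw value read from the ingress latency register.
 * digital_rollover: nonzero when PTP_TCR_TSCTRLSSR is set.
 */
static struct ingress_corr ingress_correction(uint32_t lat_reg,
					       int digital_rollover)
{
	struct ingress_corr c;
	uint64_t scaled_ns = lat_reg;

	if (digital_rollover)
		/* decimal format, 1 ns per bit */
		scaled_ns = (NSEC_PER_SEC << 16) - scaled_ns;
	else
		/* binary format, ~0.466 ns per bit, rounded to closest */
		scaled_ns = ((1ULL << 31) << 16) -
			    (scaled_ns * PSEC_PER_NSEC + 466 / 2) / 466;

	/* bit 31 is always set, as in the patch */
	c.ns = (uint32_t)(scaled_ns >> 16) | (1U << 31);
	c.sns = (uint32_t)(scaled_ns & 0xff00);
	return c;
}
```

With lat_reg == 0 and binary rollover, ingress_correction() returns
ns == 0x80000000, which is the value pointed out above for cores whose
latency registers read back as zero.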
Johannes Zink July 27, 2023, 6:39 a.m. UTC | #12
Hi Richard,

On 7/26/23 17:43, Richard Cochran wrote:
> On Wed, Jul 26, 2023 at 08:10:35AM +0200, Johannes Zink wrote:
> 
>> Also on a side-note, "driver nonsense" sounds a bit harsh from someone
>> always insisting that one should not compensate for bad drivers in the
>> userspace stack and instead fixing driver and hardware issues in the kernel,
>> don't you think?
> 
> Everything has its place.
> 
> The proper place to account for delay asymmetries is in the user space
> configuration, for example in linuxptp you have
This is not about Delay Asymmetry, but about Additional Errors in Path Delay, 
namely MAC Ingress and Egress Delay.

> 
>         delayAsymmetry
>                The  time  difference in nanoseconds of the transmit and receive
>                paths. This value should be positive when  the  server-to-client
>                propagation  time  is  longer  and  negative when the client-to-
>                server time is longer. The default is 0 nanoseconds.
> 
>         egressLatency
>                Specifies the difference in nanoseconds between the actual
>                transmission time at the reference plane and the reported
>                transmit time stamp. This value will be added to egress time
>                stamps obtained from the hardware.  The default is 0.
> 
>         ingressLatency
>                Specifies the difference in nanoseconds between the reported
>                receive time stamp and the actual reception time at reference
>                plane. This value will be subtracted from ingress time stamps
>                obtained from the hardware.  The default is 0.
For the PTP stack you could probably configure these in the stack, but fixing
the delay in the driver also has the advantage of reducing phase offset error
when doing clock recovery from the PHC.

> 
> Trying to hard code those into the driver?  Good luck getting that
> right for everyone.
That's why we don't hardcode the values but read them from the registers 
provided by the IP core.

> 
> BTW this driver is actually for an IP core used in many, many SoCs.
> 
> How many _other_ SoCs did you test your patch on?
> 
I don't have many available, thus as stated in the description: on the i.MX8MP 
only. That's why I am implementing my stuff in the imx glue code, you're 
welcome to help testing on other hardware if you have any at hand.

Best regards
Johannes

> Thanks,
> Richard
> 
> 
>
Johannes Zink July 27, 2023, 6:40 a.m. UTC | #13
Hi Richard,

On 7/26/23 17:34, Richard Cochran wrote:
> On Wed, Jul 26, 2023 at 08:04:37AM +0200, Marc Kleine-Budde wrote:
> 
>> At least the datasheet of the IP core tells to read the MAC delay from
>> the IP core (1), add the PHY delay (2) and the clock domain crossing
>> delay (3) and write it to the time stamp correction register.
> 
> That is great, until they change the data sheet.  Really, this happens.

I think I don't get your point here.

That's true for literally any register of any peripheral in a datasheet.
I think we can just stop doing driver development if we wait for a final
revision that is not changed any more. Datasheets change, and if they do we
update the driver.

Johannes


> 
> Thanks,
> Richard
> 
>
Johannes Zink July 27, 2023, 6:42 a.m. UTC | #14
Hi Richard,

On 7/26/23 20:00, Richard Cochran wrote:
> On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:
> 
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
>> index bf619295d079..d1fe4b46f162 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
>> @@ -26,6 +26,12 @@
>>   #define	PTP_ACR		0x40	/* Auxiliary Control Reg */
>>   #define	PTP_ATNR	0x48	/* Auxiliary Timestamp - Nanoseconds Reg */
>>   #define	PTP_ATSR	0x4c	/* Auxiliary Timestamp - Seconds Reg */
>> +#define	PTP_TS_INGR_CORR_NS	0x58	/* Ingress timestamp correction nanoseconds */
>> +#define	PTP_TS_EGR_CORR_NS	0x5C	/* Egress timestamp correction nanoseconds*/
>> +#define	PTP_TS_INGR_CORR_SNS	0x60	/* Ingress timestamp correction subnanoseconds */
>> +#define	PTP_TS_EGR_CORR_SNS	0x64	/* Egress timestamp correction subnanoseconds */
> 
> These two...
> 
>> +#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
>> +#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */
> 
> do not exist on earlier versions of the IP core.
> 
> I wonder what values are there?
> 

good catch, I think adding the register definition won't hurt, but if you feel 
more comfortable about it I can add them only for IP core version 5.

Johannes


> Thanks,
> Richard
>
Johannes Zink July 27, 2023, 6:55 a.m. UTC | #15
Hi,

On 7/27/23 08:39, Johannes Zink wrote:
> Hi Richard,
> 

[snip]


>> How many _other_ SoCs did you test your patch on?
>>
> I don't have many available, thus as stated in the description: on the i.MX8MP 
> only. That's why I am implementing my stuff in the imx glue code, you're 
> welcome to help testing on other hardware if you have any at hand.
> 

note: for v3 I am going to check if we have a dwmac v5 and won't call into the 
correction setup function otherwise.

Best regards
Johannes


> Best regards
> Johannes
> 
>> Thanks,
>> Richard
>>
>>
>>
> 
>
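
The guard announced for v3 could look roughly like the following. This is a
standalone sketch under stated assumptions, not the actual v3 change: the
struct, the helper names and the version constant are hypothetical, and the
encoding of "5.00" as 0x50 is assumed. The register writes are replaced by a
counter so the control flow can be exercised in userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: 0x50 stands for core version 5.00. */
#define SKETCH_DWMAC_CORE_5_00 0x50

/* Hypothetical stand-in for the driver's private state. */
struct sketch_priv {
	uint32_t core_version;
	unsigned int corrections_written; /* counts simulated register writes */
};

static bool sketch_has_latency_regs(const struct sketch_priv *priv)
{
	return priv->core_version >= SKETCH_DWMAC_CORE_5_00;
}

static void sketch_correct_latency(struct sketch_priv *priv)
{
	/* Per the discussion above, earlier cores return zero from the
	 * latency registers, so skip the correction setup entirely.
	 */
	if (!sketch_has_latency_regs(priv))
		return;

	priv->corrections_written++;
}
```

Returning early keeps the correction registers at their reset values on cores
without the latency registers, instead of writing a value derived from a zero
read.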
Kurt Kanzenbach July 27, 2023, 7:15 a.m. UTC | #16
Hi Johannes, Richard,

On Thu Jul 27 2023, Johannes Zink wrote:
>> BTW this driver is actually for an IP core used in many, many SoCs.
>> 
>> How many _other_ SoCs did you test your patch on?
>> 
> I don't have many available, thus as stated in the description: on the i.MX8MP 
> only. That's why I am implementing my stuff in the imx glue code, you're 
> welcome to help testing on other hardware if you have any at hand.

I can assist with testing on Intel real time platforms, stm32mp1 and
Cyclone V (and imx8mp). Just Cc me on the next version of this
patch.

Thanks,
Kurt
Johannes Zink July 27, 2023, 7:18 a.m. UTC | #17
Hi Kurt,

On 7/27/23 09:15, Kurt Kanzenbach wrote:
> Hi Johannes, Richard,
> 
> On Thu Jul 27 2023, Johannes Zink wrote:
>>> BTW this driver is actually for an IP core used in many, many SoCs.
>>>
>>> How many _other_ SoCs did you test your patch on?
>>>
>> I don't have many available, thus as stated in the description: on the i.MX8MP
>> only. That's why I am implementing my stuff in the imx glue code, you're
>> welcome to help testing on other hardware if you have any at hand.
> 
> I can assist with testing on Intel real time platforms, stm32mp1 and
> Cyclone V (and imx8mp). Just Cc me on the next the version of this
> patch.

Thanks for your kind offer, I am going to CC you when I send my v3.

Best regards
Johannes

> 
> Thanks,
> Kurt
Johannes Zink July 27, 2023, 7:20 a.m. UTC | #18
Hi Richard,

On 7/26/23 22:57, Richard Cochran wrote:
> On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:
> 
> Earlier versions of the IP core return zero from these...
> 
>> +#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
>> +#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */
> 

Good catch. I am going to send a v3 with a check that sets the values for dwmac v5 only.

Best regards
Johannes


> and so...
> 
>> +static void correct_latency(struct stmmac_priv *priv)
>> +{
>> +	void __iomem *ioaddr = priv->ptpaddr;
>> +	u32 reg_tsic, reg_tsicsns;
>> +	u32 reg_tsec, reg_tsecsns;
>> +	u64 scaled_ns;
>> +	u32 val;
>> +
>> +	/* MAC-internal ingress latency */
>> +	scaled_ns = readl(ioaddr + PTP_TS_INGR_LAT);
>> +
>> +	/* See section 11.7.2.5.3.1 "Ingress Correction" on page 4001 of
>> +	 * i.MX8MP Applications Processor Reference Manual Rev. 1, 06/2021
>> +	 */
>> +	val = readl(ioaddr + PTP_TCR);
>> +	if (val & PTP_TCR_TSCTRLSSR)
>> +		/* nanoseconds field is in decimal format with granularity of 1ns/bit */
>> +		scaled_ns = ((u64)NSEC_PER_SEC << 16) - scaled_ns;
>> +	else
>> +		/* nanoseconds field is in binary format with granularity of ~0.466ns/bit */
>> +		scaled_ns = ((1ULL << 31) << 16) -
>> +			DIV_U64_ROUND_CLOSEST(scaled_ns * PSEC_PER_NSEC, 466U);
>> +
>> +	reg_tsic = scaled_ns >> 16;
>> +	reg_tsicsns = scaled_ns & 0xff00;
>> +
>> +	/* set bit 31 for 2's complement */
>> +	reg_tsic |= BIT(31);
>> +
>> +	writel(reg_tsic, ioaddr + PTP_TS_INGR_CORR_NS);
> 
> here reg_tsic = 0x80000000 for a correction of -2.15 seconds!
>
> @Jakub  Can you please revert this patch?
> 
> Thanks,
> Richard
> 
>
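
The arithmetic Richard points at can be reproduced outside the kernel. The following is a minimal userspace sketch, not driver code: ingress_corr_ns() and div_round_closest() are hypothetical stand-ins for the ingress half of correct_latency() and for the kernel's DIV_U64_ROUND_CLOSEST(), and the latency value is passed in as an argument instead of being read with readl().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define PSEC_PER_NSEC 1000ULL

/* Userspace stand-in for the kernel's DIV_U64_ROUND_CLOSEST(). */
static uint64_t div_round_closest(uint64_t n, uint32_t d)
{
	return (n + d / 2) / d;
}

/*
 * Mirrors the ingress half of correct_latency() from the patch.
 * lat_reg is the raw value that would be read from PTP_TS_INGR_LAT;
 * digital_rollover corresponds to PTP_TCR_TSCTRLSSR being set.
 * Returns the value the patch would write to PTP_TS_INGR_CORR_NS.
 */
static uint32_t ingress_corr_ns(uint32_t lat_reg, bool digital_rollover)
{
	uint64_t scaled_ns = lat_reg;

	if (digital_rollover)
		scaled_ns = (NSEC_PER_SEC << 16) - scaled_ns;
	else
		scaled_ns = ((1ULL << 31) << 16) -
			div_round_closest(scaled_ns * PSEC_PER_NSEC, 466U);

	/* Same as "reg_tsic |= BIT(31)" in the patch. */
	return (uint32_t)(scaled_ns >> 16) | (1U << 31);
}

/* A latency register that reads back as zero, in binary rollover mode. */
static void check_zero_latency_case(void)
{
	assert(ingress_corr_ns(0, false) == 0x80000000U);
}
```

With a latency register that reads back as zero and binary rollover selected, the sketch yields 0x80000000, the value quoted above. With digital rollover the same zero input yields 0xBB9ACA00, that is, 10^9 with bit 31 set.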
Johannes Zink July 27, 2023, 7:41 a.m. UTC | #19
Hi,

On 7/27/23 08:55, Johannes Zink wrote:
> Hi,
> 
> On 7/27/23 08:39, Johannes Zink wrote:
>> Hi Richard,
>>
> 
> [snip]
> 
> 
>>> How many _other_ SoCs did you test your patch on?
>>>
>> I don't have many available, thus as stated in the description: on the 
>> i.MX8MP only. That's why I am implementing my stuff in the imx glue code, 
>> you're welcome to help testing on other hardware if you have any at hand.

Small correction to what I wrote earlier: it's not implemented in the glue code,
but in the general stmmac_hwtstamp. My bad, I added it to the glue code in an
early prototype version, but then tried to generalize it.

Johannes

> 
> note: for v3 I am going to check if we have a dwmac v5 and won't call into the 
> correction setup function otherwise.
> 
> Best regards
> Johannes
> 
> 
>> Best regards
>> Johannes
>>
>>> Thanks,
>>> Richard
>>>
>>>
>>>
>>
>>
>
Richard Cochran July 27, 2023, 1:30 p.m. UTC | #20
On Thu, Jul 27, 2023 at 08:40:51AM +0200, Johannes Zink wrote:
> Hi Richard,
> 
> On 7/26/23 17:34, Richard Cochran wrote:
> > That is great, until they change the data sheet.  Really, this happens.
> 
> I think I don't get your point here.
> 
> That's true for literally any register of any peripheral in a datasheet.
> I think we can just stop doing driver development if we wait for a final
> revision that is not changed any more. Datasheets change, and if they do we
> update the driver.

This is different than normal registers, because the values are a
guess as to what the latency in the hardware design is.

Here is how it works in practice:  Vendor first asks a summer intern to
measure the latency.  Intern does some kind of random measurement, and
that goes into silicon.  One year later, customers discover that the
values are bogus.  Vendor doesn't spin a new silicon revision just for
that.  If vendor is honest, a footnote appears in the errata that the
corrections are wrong.

Thanks,
Richard
Richard Cochran July 27, 2023, 1:34 p.m. UTC | #21
On Thu, Jul 27, 2023 at 08:42:52AM +0200, Johannes Zink wrote:

> good catch, I think adding the register definition won't hurt, but if you
> feel more comfortable about it I can add them only for IP core version 5.

Adding the offsets in the header is not the issue.

The issue is reading from these offsets when there is nothing there to
read!

Thanks,
Richard
Richard Cochran July 27, 2023, 1:36 p.m. UTC | #22
On Thu, Jul 27, 2023 at 09:20:10AM +0200, Johannes Zink wrote:
> Hi Richard,
> 
> On 7/26/23 22:57, Richard Cochran wrote:
> > On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:
> > 
> > Earlier versions of the IP core return zero from these...
> > 
> > > +#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
> > > +#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */
> > 
> 
> Good catch. I am going to send a v3 with a check that sets the values for dwmac v5 only.

AFAICT there is no feature bit that indicates the presence or absence
of these two registers.

Are you sure that *all* v5 IP cores have these?

I am not sure.

Thanks,
Richard
Johannes Zink July 31, 2023, 7 a.m. UTC | #23
Hi Richard,

On 7/27/23 15:36, Richard Cochran wrote:
> On Thu, Jul 27, 2023 at 09:20:10AM +0200, Johannes Zink wrote:
>> Hi Richard,
>>
>> On 7/26/23 22:57, Richard Cochran wrote:
>>> On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:
>>>
>>> Earlier versions of the IP core return zero from these...
>>>
>>>> +#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
>>>> +#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */
>>>
>>
>> Good catch. I am going to send a v3 with a check that sets the values for dwmac v5 only.
> 
> AFAICT there is no feature bit that indicates the presence or absence
> of these two registers.
> 
> Are you sure that *all* v5 IP cores have these?
> 
> I am not sure.

I cannot tell for sure either, since I have datasheets for the i.MX8MP only. 
Maybe Kurt has some insights here, as he has additional hardware available for 
testing?

Nevertheless, I am going to add a guard to only use the correction codepath on 
i.MX8MP in v3 for the time being, we can add other hardware later trivially if 
they support doing this.

Best regards
Johannes

> 
> Thanks,
> Richard
> 
>
Richard Cochran July 31, 2023, 1:44 p.m. UTC | #24
On Mon, Jul 31, 2023 at 09:00:29AM +0200, Johannes Zink wrote:

> I cannot tell for sure either, since I have datasheets for the i.MX8MP only.
> Maybe Kurt has some insights here, as he has additional hardware available
> for testing?

Maybe give the folks who make the dwc a call to clarify?
 
> Nevertheless, I am going to add a guard to only use the correction codepath
> on i.MX8MP in v3 for the time being, we can add other hardware later
> trivially if they support doing this.

Sure.

Thanks,
Richard
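
The guard discussed in this exchange could look roughly like the sketch below. It is a userspace illustration under stated assumptions, not code from any posted version of the patch: latency_correction_usable(), struct latency_regs, read_latency_fn and the fake_hw test double are all hypothetical names, and treating a zero read-back as "register absent" is an assumption based only on the remark that earlier IP cores return zero from these offsets. The point it shows is the ordering: the platform opt-in is checked before the latency registers are touched at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Raw contents of PTP_TS_INGR_LAT and PTP_TS_EGR_LAT. */
struct latency_regs {
	uint32_t ingr;
	uint32_t egr;
};

/* Hypothetical accessor; in a driver this would wrap two readl() calls. */
typedef struct latency_regs (*read_latency_fn)(void *ctx);

/*
 * Hypothetical guard: only read the latency registers when the platform
 * glue has opted in (for example i.MX8MP, per the discussion), and even
 * then ignore values that read back as zero.
 */
static bool latency_correction_usable(bool platform_opted_in,
				      read_latency_fn read_regs, void *ctx)
{
	struct latency_regs regs;

	if (!platform_opted_in)
		return false;	/* never touch the registers */

	regs = read_regs(ctx);

	/* Assumption: cores without these registers read back zero. */
	return regs.ingr != 0 && regs.egr != 0;
}

/* Test double that counts how often the registers are read. */
struct fake_hw {
	struct latency_regs regs;
	unsigned int reads;
};

static struct latency_regs fake_read(void *ctx)
{
	struct fake_hw *hw = ctx;

	hw->reads++;
	return hw->regs;
}

/* Without the opt-in, the guard must not read the registers at all. */
static void check_not_opted_in_skips_read(void)
{
	struct fake_hw hw = { { 0x00100000U, 0x00080000U }, 0 };

	assert(!latency_correction_usable(false, fake_read, &hw));
	assert(hw.reads == 0);
}
```

Whether a zero check is meaningful on real hardware would still need confirmation from the IP vendor's documentation; the opt-in flag is the part that follows directly from the thread.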

Patch

diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 6ee7cf07cfd7..95a4d6099577 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -536,6 +536,7 @@  struct stmmac_hwtimestamp {
 	void (*get_systime) (void __iomem *ioaddr, u64 *systime);
 	void (*get_ptptime)(void __iomem *ioaddr, u64 *ptp_time);
 	void (*timestamp_interrupt)(struct stmmac_priv *priv);
+	void (*correct_latency)(struct stmmac_priv *priv);
 };
 
 #define stmmac_config_hw_tstamping(__priv, __args...) \
@@ -554,6 +555,8 @@  struct stmmac_hwtimestamp {
 	stmmac_do_void_callback(__priv, ptp, get_ptptime, __args)
 #define stmmac_timestamp_interrupt(__priv, __args...) \
 	stmmac_do_void_callback(__priv, ptp, timestamp_interrupt, __args)
+#define stmmac_correct_latency(__priv, __args...) \
+	stmmac_do_void_callback(__priv, ptp, correct_latency, __args)
 
 struct stmmac_tx_queue;
 struct stmmac_rx_queue;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
index fa2c3ba7e9fe..7e0fa024e0ad 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
@@ -60,6 +60,48 @@  static void config_sub_second_increment(void __iomem *ioaddr,
 		*ssinc = data;
 }
 
+static void correct_latency(struct stmmac_priv *priv)
+{
+	void __iomem *ioaddr = priv->ptpaddr;
+	u32 reg_tsic, reg_tsicsns;
+	u32 reg_tsec, reg_tsecsns;
+	u64 scaled_ns;
+	u32 val;
+
+	/* MAC-internal ingress latency */
+	scaled_ns = readl(ioaddr + PTP_TS_INGR_LAT);
+
+	/* See section 11.7.2.5.3.1 "Ingress Correction" on page 4001 of
+	 * i.MX8MP Applications Processor Reference Manual Rev. 1, 06/2021
+	 */
+	val = readl(ioaddr + PTP_TCR);
+	if (val & PTP_TCR_TSCTRLSSR)
+		/* nanoseconds field is in decimal format with granularity of 1ns/bit */
+		scaled_ns = ((u64)NSEC_PER_SEC << 16) - scaled_ns;
+	else
+		/* nanoseconds field is in binary format with granularity of ~0.466ns/bit */
+		scaled_ns = ((1ULL << 31) << 16) -
+			DIV_U64_ROUND_CLOSEST(scaled_ns * PSEC_PER_NSEC, 466U);
+
+	reg_tsic = scaled_ns >> 16;
+	reg_tsicsns = scaled_ns & 0xff00;
+
+	/* set bit 31 for 2's complement */
+	reg_tsic |= BIT(31);
+
+	writel(reg_tsic, ioaddr + PTP_TS_INGR_CORR_NS);
+	writel(reg_tsicsns, ioaddr + PTP_TS_INGR_CORR_SNS);
+
+	/* MAC-internal egress latency */
+	scaled_ns = readl(ioaddr + PTP_TS_EGR_LAT);
+
+	reg_tsec = scaled_ns >> 16;
+	reg_tsecsns = scaled_ns & 0xff00;
+
+	writel(reg_tsec, ioaddr + PTP_TS_EGR_CORR_NS);
+	writel(reg_tsecsns, ioaddr + PTP_TS_EGR_CORR_SNS);
+}
+
 static int init_systime(void __iomem *ioaddr, u32 sec, u32 nsec)
 {
 	u32 value;
@@ -221,4 +263,5 @@  const struct stmmac_hwtimestamp stmmac_ptp = {
 	.get_systime = get_systime,
 	.get_ptptime = get_ptptime,
 	.timestamp_interrupt = timestamp_interrupt,
+	.correct_latency = correct_latency,
 };
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index efe85b086abe..ee78e69e9ae3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -909,6 +909,8 @@  static int stmmac_init_ptp(struct stmmac_priv *priv)
 	priv->hwts_tx_en = 0;
 	priv->hwts_rx_en = 0;
 
+	stmmac_correct_latency(priv, priv);
+
 	return 0;
 }
 
@@ -1094,6 +1096,8 @@  static void stmmac_mac_link_up(struct phylink_config *config,
 
 	if (priv->dma_cap.fpesel)
 		stmmac_fpe_link_state_handle(priv, true);
+
+	stmmac_correct_latency(priv, priv);
 }
 
 static const struct phylink_mac_ops stmmac_phylink_mac_ops = {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
index bf619295d079..d1fe4b46f162 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
@@ -26,6 +26,12 @@ 
 #define	PTP_ACR		0x40	/* Auxiliary Control Reg */
 #define	PTP_ATNR	0x48	/* Auxiliary Timestamp - Nanoseconds Reg */
 #define	PTP_ATSR	0x4c	/* Auxiliary Timestamp - Seconds Reg */
+#define	PTP_TS_INGR_CORR_NS	0x58	/* Ingress timestamp correction nanoseconds */
+#define	PTP_TS_EGR_CORR_NS	0x5C	/* Egress timestamp correction nanoseconds */
+#define	PTP_TS_INGR_CORR_SNS	0x60	/* Ingress timestamp correction subnanoseconds */
+#define	PTP_TS_EGR_CORR_SNS	0x64	/* Egress timestamp correction subnanoseconds */
+#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
+#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */
 
 #define	PTP_STNSUR_ADDSUB_SHIFT	31
 #define	PTP_DIGITAL_ROLLOVER_MODE	0x3B9ACA00	/* 10e9-1 ns */