Message ID | 20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de (mailing list archive) |
---|---|
State | Superseded |
Commit | 20bf98c94146eb6fe62177817cb32f53e72dd2e8 |
Delegated to: | Netdev Maintainers |
Series | [v2] net: stmmac: correct MAC propagation delay |
On Mon, 24 Jul 2023 12:01:31 +0200 Johannes Zink wrote:
> The IEEE1588 Standard specifies that the timestamps of Packets must be
> captured when the PTP message timestamp point (leading edge of first
> octet after the start of frame delimiter) crosses the boundary between
> the node and the network. As the MAC latches the timestamp at an
> internal point, the captured timestamp must be corrected for the
> additional path latency, as described in the publicly available
> datasheet [1].
>
> This patch only corrects for the MAC-Internal delay, which can be read
> out from the MAC_Ingress_Timestamp_Latency register, since the Phy
> framework currently does not support querying the Phy ingress and egress
> latency. The Clock Domain Crossing Circuits errors as indicated in [1]
> are already being accounted in the stmmac_get_tx_hwtstamp() function and
> are not corrected here.
>
> As the Latency varies for different link speeds and MII
> modes of operation, the correction value needs to be updated on each
> link state change.
>
> As the delay also causes a phase shift in the timestamp counter compared
> to the rest of the network, this correction will also reduce phase error
> when generating PPS outputs from the timestamp counter.
>
> [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
> correction"

Hi Richard,

any opinion on this one?

The subject read to me like it's about *MII clocking delays, I figured
you may have missed it, too.

> diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
> index 6ee7cf07cfd7..95a4d6099577 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
> @@ -536,6 +536,7 @@ struct stmmac_hwtimestamp {
>  	void (*get_systime) (void __iomem *ioaddr, u64 *systime);
>  	void (*get_ptptime)(void __iomem *ioaddr, u64 *ptp_time);
>  	void (*timestamp_interrupt)(struct stmmac_priv *priv);
> +	void (*correct_latency)(struct stmmac_priv *priv);
>  };
>
>  #define stmmac_config_hw_tstamping(__priv, __args...) \
> @@ -554,6 +555,8 @@ struct stmmac_hwtimestamp {
>  	stmmac_do_void_callback(__priv, ptp, get_ptptime, __args)
>  #define stmmac_timestamp_interrupt(__priv, __args...) \
>  	stmmac_do_void_callback(__priv, ptp, timestamp_interrupt, __args)
> +#define stmmac_correct_latency(__priv, __args...) \
> +	stmmac_do_void_callback(__priv, ptp, correct_latency, __args)
>
>  struct stmmac_tx_queue;
>  struct stmmac_rx_queue;
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
> index fa2c3ba7e9fe..7e0fa024e0ad 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
> @@ -60,6 +60,48 @@ static void config_sub_second_increment(void __iomem *ioaddr,
>  	*ssinc = data;
>  }
>
> +static void correct_latency(struct stmmac_priv *priv)
> +{
> +	void __iomem *ioaddr = priv->ptpaddr;
> +	u32 reg_tsic, reg_tsicsns;
> +	u32 reg_tsec, reg_tsecsns;
> +	u64 scaled_ns;
> +	u32 val;
> +
> +	/* MAC-internal ingress latency */
> +	scaled_ns = readl(ioaddr + PTP_TS_INGR_LAT);
> +
> +	/* See section 11.7.2.5.3.1 "Ingress Correction" on page 4001 of
> +	 * i.MX8MP Applications Processor Reference Manual Rev. 1, 06/2021
> +	 */
> +	val = readl(ioaddr + PTP_TCR);
> +	if (val & PTP_TCR_TSCTRLSSR)
> +		/* nanoseconds field is in decimal format with granularity of 1ns/bit */
> +		scaled_ns = ((u64)NSEC_PER_SEC << 16) - scaled_ns;
> +	else
> +		/* nanoseconds field is in binary format with granularity of ~0.466ns/bit */
> +		scaled_ns = ((1ULL << 31) << 16) -
> +			DIV_U64_ROUND_CLOSEST(scaled_ns * PSEC_PER_NSEC, 466U);
> +
> +	reg_tsic = scaled_ns >> 16;
> +	reg_tsicsns = scaled_ns & 0xff00;
> +
> +	/* set bit 31 for 2's complement */
> +	reg_tsic |= BIT(31);
> +
> +	writel(reg_tsic, ioaddr + PTP_TS_INGR_CORR_NS);
> +	writel(reg_tsicsns, ioaddr + PTP_TS_INGR_CORR_SNS);
> +
> +	/* MAC-internal egress latency */
> +	scaled_ns = readl(ioaddr + PTP_TS_EGR_LAT);
> +
> +	reg_tsec = scaled_ns >> 16;
> +	reg_tsecsns = scaled_ns & 0xff00;
> +
> +	writel(reg_tsec, ioaddr + PTP_TS_EGR_CORR_NS);
> +	writel(reg_tsecsns, ioaddr + PTP_TS_EGR_CORR_SNS);
> +}
> +
>  static int init_systime(void __iomem *ioaddr, u32 sec, u32 nsec)
>  {
>  	u32 value;
> @@ -221,4 +263,5 @@ const struct stmmac_hwtimestamp stmmac_ptp = {
>  	.get_systime = get_systime,
>  	.get_ptptime = get_ptptime,
>  	.timestamp_interrupt = timestamp_interrupt,
> +	.correct_latency = correct_latency,
>  };
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index efe85b086abe..ee78e69e9ae3 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -909,6 +909,8 @@ static int stmmac_init_ptp(struct stmmac_priv *priv)
>  	priv->hwts_tx_en = 0;
>  	priv->hwts_rx_en = 0;
>
> +	stmmac_correct_latency(priv, priv);
> +
>  	return 0;
>  }
>
> @@ -1094,6 +1096,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
>
>  	if (priv->dma_cap.fpesel)
>  		stmmac_fpe_link_state_handle(priv, true);
> +
> +	stmmac_correct_latency(priv, priv);
>  }
>
>  static const struct phylink_mac_ops stmmac_phylink_mac_ops = {
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> index bf619295d079..d1fe4b46f162 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> @@ -26,6 +26,12 @@
>  #define PTP_ACR 0x40 /* Auxiliary Control Reg */
>  #define PTP_ATNR 0x48 /* Auxiliary Timestamp - Nanoseconds Reg */
>  #define PTP_ATSR 0x4c /* Auxiliary Timestamp - Seconds Reg */
> +#define PTP_TS_INGR_CORR_NS 0x58 /* Ingress timestamp correction nanoseconds */
> +#define PTP_TS_EGR_CORR_NS 0x5C /* Egress timestamp correction nanoseconds*/
> +#define PTP_TS_INGR_CORR_SNS 0x60 /* Ingress timestamp correction subnanoseconds */
> +#define PTP_TS_EGR_CORR_SNS 0x64 /* Egress timestamp correction subnanoseconds */
> +#define PTP_TS_INGR_LAT 0x68 /* MAC internal Ingress Latency */
> +#define PTP_TS_EGR_LAT 0x6c /* MAC internal Egress Latency */
>
>  #define PTP_STNSUR_ADDSUB_SHIFT 31
>  #define PTP_DIGITAL_ROLLOVER_MODE 0x3B9ACA00 /* 10e9-1 ns */
>
> ---
> base-commit: ba80e20d7f3f87dab3f9f0c0ca66e4b1fcc7be9f
> change-id: 20230719-stmmac_correct_mac_delay-4278cb9d9bc1
>
> Best regards,
On Tue, Jul 25, 2023 at 08:06:06PM -0700, Jakub Kicinski wrote:

> any opinion on this one?

Yeah, I saw it, but I can't get excited about drivers trying to
correct delays. I don't think this can be done automatically in a
reliable way, and so I expect that the few end users who are really
getting into the microseconds and nanoseconds will calibrate their
systems end to end, maybe even patching out this driver nonsense in
their kernels.

Having said that, I won't stand in the way of such driver stuff.
After all, who cares about a few microseconds time error one way or
the other?

Thanks,
Richard
On Tue, 25 Jul 2023 20:22:53 -0700 Richard Cochran wrote:
> > any opinion on this one?
>
> Yeah, I saw it, but I can't get excited about drivers trying to
> correct delays. I don't think this can be done automatically in a
> reliable way, and so I expect that the few end users who are really
> getting into the microseconds and nanoseconds will calibrate their
> systems end to end, maybe even patching out this driver nonsense in
> their kernels.
>
> Having said that, I won't stand in the way of such driver stuff.
> After all, who cares about a few microseconds time error one way or
> the other?

I see :)
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 24 Jul 2023 12:01:31 +0200 you wrote:
> The IEEE1588 Standard specifies that the timestamps of Packets must be
> captured when the PTP message timestamp point (leading edge of first
> octet after the start of frame delimiter) crosses the boundary between
> the node and the network. As the MAC latches the timestamp at an
> internal point, the captured timestamp must be corrected for the
> additional path latency, as described in the publicly available
> datasheet [1].
>
> [...]

Here is the summary with links:
  - [v2] net: stmmac: correct MAC propagation delay
    https://git.kernel.org/netdev/net-next/c/20bf98c94146

You are awesome, thank you!
On 25.07.2023 20:06:06, Jakub Kicinski wrote:
> On Mon, 24 Jul 2023 12:01:31 +0200 Johannes Zink wrote:
> > The IEEE1588 Standard specifies that the timestamps of Packets must be
> > captured when the PTP message timestamp point (leading edge of first
> > octet after the start of frame delimiter) crosses the boundary between
> > the node and the network. As the MAC latches the timestamp at an
> > internal point, the captured timestamp must be corrected for the
> > additional path latency, as described in the publicly available
> > datasheet [1].
> >
> > This patch only corrects for the MAC-Internal delay, which can be read
> > out from the MAC_Ingress_Timestamp_Latency register, since the Phy
> > framework currently does not support querying the Phy ingress and egress
> > latency. The Clock Domain Crossing Circuits errors as indicated in [1]
> > are already being accounted in the stmmac_get_tx_hwtstamp() function and
> > are not corrected here.
> >
> > As the Latency varies for different link speeds and MII
> > modes of operation, the correction value needs to be updated on each
> > link state change.
> >
> > As the delay also causes a phase shift in the timestamp counter compared
> > to the rest of the network, this correction will also reduce phase error
> > when generating PPS outputs from the timestamp counter.
> >
> > [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
> > correction"
>
> Hi Richard,
>
> any opinion on this one?
>
> The subject read to me like it's about *MII clocking delays, I figured
> you may have missed it, too.

The patch description clarifies what is being corrected, namely the
"MAC-internal delay, which can be read out from the
MAC_Ingress_Timestamp_Latency register".

The next step would be to correct PHY latency, but there is no support
for querying PHY latency yet.

regards,
Marc
On 25.07.2023 20:22:53, Richard Cochran wrote:
> On Tue, Jul 25, 2023 at 08:06:06PM -0700, Jakub Kicinski wrote:
>
> > any opinion on this one?
>
> Yeah, I saw it, but I can't get excited about drivers trying to
> correct delays. I don't think this can be done automatically in a
> reliable way,

At least the datasheet of the IP core tells to read the MAC delay from
the IP core (1), add the PHY delay (2) and the clock domain crossing
delay (3) and write it to the time stamp correction register.

(1) added in this patch
(2) future work
(3) already in the driver, though corrected manually when reading the
    timestamp

At least in our measurements the peer delay is better with this patch
(measured with ptp4linux) and the end-to-end delay (comparison of 2 PPS
signals on a scope) is also better.

> and so I expect that the few end users who are really
> getting into the microseconds and nanoseconds will calibrate their
> systems end to end, maybe even patching out this driver nonsense in
> their kernels.

What issues make you think this change/approach is counterproductive?

> Having said that, I won't stand in the way of such driver stuff.
> After all, who cares about a few microseconds time error one way or
> the other?

There are several companies that use or plan to use PTP in their
products and are striving to achieve sub-microsecond synchronization.

regards,
Marc
Hi Richard,

On 7/26/23 05:22, Richard Cochran wrote:
> On Tue, Jul 25, 2023 at 08:06:06PM -0700, Jakub Kicinski wrote:
>
>> any opinion on this one?
>
> Yeah, I saw it, but I can't get excited about drivers trying to
> correct delays. I don't think this can be done automatically in a
> reliable way, and so I expect that the few end users who are really
> getting into the microseconds and nanoseconds will calibrate their
> systems end to end, maybe even patching out this driver nonsense in
> their kernels.
>

Thanks for reading and commenting on my patch. As the commit message
elaborates, the patch corrects for the MAC-internal delays (these are neither
PHY delays nor cable delays) that arise from the timestamps not being taken at
the packet egress, but at an internal point in the MAC. The compensation values
are read from internal registers of the hardware since these values depend on
the actual operational mode of the MAC and on the MII link. I have done
extensive testing, and as far as my results are concerned, this is reliable at
least on the i.MX8MP Hardware I can access for testing. I would actually like
to correct this on other MACs too, but they are often poorly documented. I have
to admit that the DWMAC is one of the first pieces of hardware I encountered
with proper documentation. The driver admittedly still has room for
improvements - so here we go...

Nevertheless, there are still PHY delays to be corrected for, but I need to
extend the PHY framework for querying the clause 45 registers to account for
the PHY delays (which are an even larger factor). I plan to send another series
fixing this, but this still needs some cleanup being done.

Also on a side-note, "driver nonsense" sounds a bit harsh from someone always
insisting that one should not compensate for bad drivers in the userspace stack
and instead fixing driver and hardware issues in the kernel, don't you think?

> Having said that, I won't stand in the way of such driver stuff.
> After all, who cares about a few microseconds time error one way or
> the other?

I do, and so does my customer. If you want to reach sub-microsecond accuracy
with a linuxptp setup (which is absolutely feasible on COTS hardware), you have
to take these things into account. I did quite extensive tests, and measuring
the peer delay as precisely as possible is one of the key steps in getting
offsets down between physical nodes. As I use the PHCs to recover clocks with
as low phase offset as possible, the peer delays matter, as they add phase
error.

At the moment, this patch reduces the offset of approx 150ns to <50ns in a real
world application, which is not so bad for a few lines of code, I guess...

I don't want to kick off a lengthy discussion here (especially since Jakub
already picked the patch to next), but maybe this mail can help for
clarification in the future, when the next poor soul does work on the hwtstamps
in the dwmac.

Thanks, also for keeping linuxptp going,
Johannes

>
> Thanks,
> Richard
>
>
On Wed, Jul 26, 2023 at 08:04:37AM +0200, Marc Kleine-Budde wrote:

> At least the datasheet of the IP core tells to read the MAC delay from
> the IP core (1), add the PHY delay (2) and the clock domain crossing
> delay (3) and write it to the time stamp correction register.

That is great, until they change the data sheet. Really, this happens.

Thanks,
Richard
On Wed, Jul 26, 2023 at 08:10:35AM +0200, Johannes Zink wrote:

> Also on a side-note, "driver nonsense" sounds a bit harsh from someone
> always insisting that one should not compensate for bad drivers in the
> userspace stack and instead fixing driver and hardware issues in the kernel,
> don't you think?

Everything has its place.

The proper place to account for delay asymmetries is in the user space
configuration, for example in linuxptp you have

       delayAsymmetry
              The time difference in nanoseconds of the transmit and receive
              paths. This value should be positive when the server-to-client
              propagation time is longer and negative when the
              client-to-server time is longer. The default is 0 nanoseconds.

       egressLatency
              Specifies the difference in nanoseconds between the actual
              transmission time at the reference plane and the reported
              transmit time stamp. This value will be added to egress time
              stamps obtained from the hardware. The default is 0.

       ingressLatency
              Specifies the difference in nanoseconds between the reported
              receive time stamp and the actual reception time at reference
              plane. This value will be subtracted from ingress time stamps
              obtained from the hardware. The default is 0.

Trying to hard code those into the driver? Good luck getting that
right for everyone.

BTW this driver is actually for an IP core used in many, many SoCs.

How many _other_ SoCs did you test your patch on?

Thanks,
Richard
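Editorial sketch: the egressLatency and ingressLatency descriptions quoted above amount to one addition and one subtraction per time stamp. The C fragment below only illustrates that arithmetic. The struct and function names are hypothetical and are not taken from linuxptp or from the stmmac driver.

```c
#include <stdint.h>

/* Hypothetical container for the two user-space settings, in nanoseconds. */
struct latency_cfg {
	int64_t ingress_latency_ns;	/* models ingressLatency */
	int64_t egress_latency_ns;	/* models egressLatency */
};

/* Per the quoted text, ingressLatency is subtracted from receive time stamps. */
static int64_t corrected_rx_ns(int64_t hw_rx_ns, const struct latency_cfg *cfg)
{
	return hw_rx_ns - cfg->ingress_latency_ns;
}

/* Per the quoted text, egressLatency is added to transmit time stamps. */
static int64_t corrected_tx_ns(int64_t hw_tx_ns, const struct latency_cfg *cfg)
{
	return hw_tx_ns + cfg->egress_latency_ns;
}
```

With an ingress latency of 40 ns and an egress latency of 25 ns, a hardware time stamp of 1000 ns becomes 960 ns on receive and 1025 ns on transmit.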
On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:

> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> index bf619295d079..d1fe4b46f162 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
> @@ -26,6 +26,12 @@
>  #define PTP_ACR 0x40 /* Auxiliary Control Reg */
>  #define PTP_ATNR 0x48 /* Auxiliary Timestamp - Nanoseconds Reg */
>  #define PTP_ATSR 0x4c /* Auxiliary Timestamp - Seconds Reg */
> +#define PTP_TS_INGR_CORR_NS 0x58 /* Ingress timestamp correction nanoseconds */
> +#define PTP_TS_EGR_CORR_NS 0x5C /* Egress timestamp correction nanoseconds*/
> +#define PTP_TS_INGR_CORR_SNS 0x60 /* Ingress timestamp correction subnanoseconds */
> +#define PTP_TS_EGR_CORR_SNS 0x64 /* Egress timestamp correction subnanoseconds */

These two...

> +#define PTP_TS_INGR_LAT 0x68 /* MAC internal Ingress Latency */
> +#define PTP_TS_EGR_LAT 0x6c /* MAC internal Egress Latency */

do not exist on earlier versions of the IP core.

I wonder what values are there?

Thanks,
Richard
On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:

Earlier versions of the IP core return zero from these...

> +#define PTP_TS_INGR_LAT 0x68 /* MAC internal Ingress Latency */
> +#define PTP_TS_EGR_LAT 0x6c /* MAC internal Egress Latency */

and so...

> +static void correct_latency(struct stmmac_priv *priv)
> +{
> +	void __iomem *ioaddr = priv->ptpaddr;
> +	u32 reg_tsic, reg_tsicsns;
> +	u32 reg_tsec, reg_tsecsns;
> +	u64 scaled_ns;
> +	u32 val;
> +
> +	/* MAC-internal ingress latency */
> +	scaled_ns = readl(ioaddr + PTP_TS_INGR_LAT);
> +
> +	/* See section 11.7.2.5.3.1 "Ingress Correction" on page 4001 of
> +	 * i.MX8MP Applications Processor Reference Manual Rev. 1, 06/2021
> +	 */
> +	val = readl(ioaddr + PTP_TCR);
> +	if (val & PTP_TCR_TSCTRLSSR)
> +		/* nanoseconds field is in decimal format with granularity of 1ns/bit */
> +		scaled_ns = ((u64)NSEC_PER_SEC << 16) - scaled_ns;
> +	else
> +		/* nanoseconds field is in binary format with granularity of ~0.466ns/bit */
> +		scaled_ns = ((1ULL << 31) << 16) -
> +			DIV_U64_ROUND_CLOSEST(scaled_ns * PSEC_PER_NSEC, 466U);
> +
> +	reg_tsic = scaled_ns >> 16;
> +	reg_tsicsns = scaled_ns & 0xff00;
> +
> +	/* set bit 31 for 2's complement */
> +	reg_tsic |= BIT(31);
> +
> +	writel(reg_tsic, ioaddr + PTP_TS_INGR_CORR_NS);

here reg_tsic = 0x80000000 for a correction of -2.15 seconds!

@Jakub Can you please revert this patch?

Thanks,
Richard
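Editorial sketch: the arithmetic Richard walks through can be replayed in user space. The fragment below copies the ingress computation from correct_latency() in the patch and assumes, as Richard reports, that a core without the latency register reads back zero. The struct and function names are made up for illustration. Only the arithmetic comes from the patch, and DIV_U64_ROUND_CLOSEST is modelled here as "add half the divisor, then divide".

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC	1000000000ULL
#define PSEC_PER_NSEC	1000ULL

/* Hypothetical holder for the two values the patch writes back. */
struct ingress_corr {
	uint32_t ns;	/* would go to PTP_TS_INGR_CORR_NS */
	uint32_t sns;	/* would go to PTP_TS_INGR_CORR_SNS */
};

/*
 * latency_reg models the value read from PTP_TS_INGR_LAT,
 * digital_rollover models the PTP_TCR_TSCTRLSSR bit being set.
 */
static struct ingress_corr ingress_corr_from_latency(uint32_t latency_reg,
						     bool digital_rollover)
{
	uint64_t scaled_ns = latency_reg;
	struct ingress_corr c;

	if (digital_rollover)
		/* decimal format, 1 ns per bit */
		scaled_ns = (NSEC_PER_SEC << 16) - scaled_ns;
	else
		/* binary format, about 0.466 ns per bit, rounded to closest */
		scaled_ns = ((1ULL << 31) << 16) -
			    (scaled_ns * PSEC_PER_NSEC + 233) / 466;

	/* same truncation to 32 bits and bit 31 handling as in the patch */
	c.ns = (uint32_t)(scaled_ns >> 16) | (1u << 31);
	c.sns = (uint32_t)(scaled_ns & 0xff00);
	return c;
}
```

With latency_reg = 0 in binary rollover mode this yields ns = 0x80000000, the value Richard quotes. In digital rollover mode the same zero input yields 0xBB9ACA00. Either way the register receives a non-zero correction although no latency was ever reported.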
Hi Richard,

On 7/26/23 17:43, Richard Cochran wrote:
> On Wed, Jul 26, 2023 at 08:10:35AM +0200, Johannes Zink wrote:
>
>> Also on a side-note, "driver nonsense" sounds a bit harsh from someone
>> always insisting that one should not compensate for bad drivers in the
>> userspace stack and instead fixing driver and hardware issues in the kernel,
>> don't you think?
>
> Everything has its place.
>
> The proper place to account for delay asymmetries is in the user space
> configuration, for example in linuxptp you have

This is not about Delay Asymmetry, but about Additional Errors in Path Delay,
namely MAC Ingress and Egress Delay.

>
>         delayAsymmetry
>                The time difference in nanoseconds of the transmit and receive
>                paths. This value should be positive when the server-to-client
>                propagation time is longer and negative when the
>                client-to-server time is longer. The default is 0 nanoseconds.
>
>         egressLatency
>                Specifies the difference in nanoseconds between the actual
>                transmission time at the reference plane and the reported
>                transmit time stamp. This value will be added to egress time
>                stamps obtained from the hardware. The default is 0.
>
>         ingressLatency
>                Specifies the difference in nanoseconds between the reported
>                receive time stamp and the actual reception time at reference
>                plane. This value will be subtracted from ingress time stamps
>                obtained from the hardware. The default is 0.

For the PTP stack you could probably configure these in the stack, but fixing
the delay in the driver also has the advantage of reducing phase offset error
when doing clock recovery from the PHC.

>
> Trying to hard code those into the driver? Good luck getting that
> right for everyone.

That's why we don't hardcode the values but read them from the registers
provided by the IP core.

>
> BTW this driver is actually for an IP core used in many, many SoCs.
>
> How many _other_ SoCs did you test your patch on?
>

I don't have many available, thus as stated in the description: on the i.MX8MP
only. That's why I am implementing my stuff in the imx glue code, you're welcome
to help testing on other hardware if you have any at hand.

Best regards
Johannes

> Thanks,
> Richard
>
>
>
Hi Richard,

On 7/26/23 17:34, Richard Cochran wrote:
> On Wed, Jul 26, 2023 at 08:04:37AM +0200, Marc Kleine-Budde wrote:
>
>> At least the datasheet of the IP core tells to read the MAC delay from
>> the IP core (1), add the PHY delay (2) and the clock domain crossing
>> delay (3) and write it to the time stamp correction register.
>
> That is great, until they change the data sheet. Really, this happens.

I think I don't get your point here.

That's true for literally any register of any peripheral in a datasheet.
I think we can just stop doing driver development if we wait for a final
revision that is not changed any more. Datasheets change, and if they do we
update the driver.

Johannes

>
> Thanks,
> Richard
>
>
Hi Richard,

On 7/26/23 20:00, Richard Cochran wrote:
> On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:
>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
>> index bf619295d079..d1fe4b46f162 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
>> @@ -26,6 +26,12 @@
>>  #define PTP_ACR 0x40 /* Auxiliary Control Reg */
>>  #define PTP_ATNR 0x48 /* Auxiliary Timestamp - Nanoseconds Reg */
>>  #define PTP_ATSR 0x4c /* Auxiliary Timestamp - Seconds Reg */
>> +#define PTP_TS_INGR_CORR_NS 0x58 /* Ingress timestamp correction nanoseconds */
>> +#define PTP_TS_EGR_CORR_NS 0x5C /* Egress timestamp correction nanoseconds*/
>> +#define PTP_TS_INGR_CORR_SNS 0x60 /* Ingress timestamp correction subnanoseconds */
>> +#define PTP_TS_EGR_CORR_SNS 0x64 /* Egress timestamp correction subnanoseconds */
>
> These two...
>
>> +#define PTP_TS_INGR_LAT 0x68 /* MAC internal Ingress Latency */
>> +#define PTP_TS_EGR_LAT 0x6c /* MAC internal Egress Latency */
>
> do not exist on earlier versions of the IP core.
>
> I wonder what values are there?
>

good catch, I think adding the register definition won't hurt, but if you feel
more comfortable about it I can add them only for IP core version 5.

Johannes

> Thanks,
> Richard
>
Hi,

On 7/27/23 08:39, Johannes Zink wrote:
> Hi Richard,
>

[snip]

>> How many _other_ SoCs did you test your patch on?
>>
> I don't have many available, thus as stated in the description: on the i.MX8MP
> only. That's why I am implementing my stuff in the imx glue code, you're
> welcome to help testing on other hardware if you have any at hand.
>

note: for v3 I am going to check if we have a dwmac v5 and won't call into the
correction setup function otherwise.

Best regards
Johannes

> Best regards
> Johannes
>
>> Thanks,
>> Richard
>>
>>
>>
>
>
Hi Johannes, Richard,

On Thu Jul 27 2023, Johannes Zink wrote:
>> BTW this driver is actually for an IP core used in many, many SoCs.
>>
>> How many _other_ SoCs did you test your patch on?
>>
> I don't have many available, thus as stated in the description: on the i.MX8MP
> only. That's why I am implementing my stuff in the imx glue code, you're
> welcome to help testing on other hardware if you have any at hand.

I can assist with testing on Intel real time platforms, stm32mp1 and
Cyclone V (and imx8mp). Just Cc me on the next version of this
patch.

Thanks,
Kurt
Hi Kurt,

On 7/27/23 09:15, Kurt Kanzenbach wrote:
> Hi Johannes, Richard,
>
> On Thu Jul 27 2023, Johannes Zink wrote:
>>> BTW this driver is actually for an IP core used in many, many SoCs.
>>>
>>> How many _other_ SoCs did you test your patch on?
>>>
>> I don't have many available, thus as stated in the description: on the i.MX8MP
>> only. That's why I am implementing my stuff in the imx glue code, you're
>> welcome to help testing on other hardware if you have any at hand.
>
> I can assist with testing on Intel real time platforms, stm32mp1 and
> Cyclone V (and imx8mp). Just Cc me on the next version of this
> patch.

Thanks for your kind offer, I am going to CC you when I send my v3.

Best regards
Johannes

>
> Thanks,
> Kurt
Hi Richard,

On 7/26/23 22:57, Richard Cochran wrote:
> On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:
>
> Earlier versions of the IP core return zero from these...
>
>> +#define PTP_TS_INGR_LAT 0x68 /* MAC internal Ingress Latency */
>> +#define PTP_TS_EGR_LAT 0x6c /* MAC internal Egress Latency */
>

good catch. Gonna send a v3 with a check and set the values for dwmac v5 only.

Best regards
Johannes

> and so...
>
>> +static void correct_latency(struct stmmac_priv *priv)
>> +{
>> +	void __iomem *ioaddr = priv->ptpaddr;
>> +	u32 reg_tsic, reg_tsicsns;
>> +	u32 reg_tsec, reg_tsecsns;
>> +	u64 scaled_ns;
>> +	u32 val;
>> +
>> +	/* MAC-internal ingress latency */
>> +	scaled_ns = readl(ioaddr + PTP_TS_INGR_LAT);
>> +
>> +	/* See section 11.7.2.5.3.1 "Ingress Correction" on page 4001 of
>> +	 * i.MX8MP Applications Processor Reference Manual Rev. 1, 06/2021
>> +	 */
>> +	val = readl(ioaddr + PTP_TCR);
>> +	if (val & PTP_TCR_TSCTRLSSR)
>> +		/* nanoseconds field is in decimal format with granularity of 1ns/bit */
>> +		scaled_ns = ((u64)NSEC_PER_SEC << 16) - scaled_ns;
>> +	else
>> +		/* nanoseconds field is in binary format with granularity of ~0.466ns/bit */
>> +		scaled_ns = ((1ULL << 31) << 16) -
>> +			DIV_U64_ROUND_CLOSEST(scaled_ns * PSEC_PER_NSEC, 466U);
>> +
>> +	reg_tsic = scaled_ns >> 16;
>> +	reg_tsicsns = scaled_ns & 0xff00;
>> +
>> +	/* set bit 31 for 2's complement */
>> +	reg_tsic |= BIT(31);
>> +
>> +	writel(reg_tsic, ioaddr + PTP_TS_INGR_CORR_NS);
>
> here reg_tsic = 0x80000000 for a correction of -2.15 seconds!
>
> @Jakub Can you please revert this patch?
>
> Thanks,
> Richard
>
>
Hi,

On 7/27/23 08:55, Johannes Zink wrote:
> Hi,
>
> On 7/27/23 08:39, Johannes Zink wrote:
>> Hi Richard,
>>
>
> [snip]
>
>
>>> How many _other_ SoCs did you test your patch on?
>>>
>> I don't have many available, thus as stated in the description: on the
>> i.MX8MP only. That's why I am implementing my stuff in the imx glue code,
>> you're welcome to help testing on other hardware if you have any at hand.

small correction to what I wrote earlier: it's not implemented in the gluecode,
but in the general stmmac_hwtstamp. My bad, I added it to the gluecode in an
early prototype version, but then tried to generalize it.

Johannes

>
> note: for v3 I am going to check if we have a dwmac v5 and won't call into the
> correction setup function otherwise.
>
> Best regards
> Johannes
>
>
>> Best regards
>> Johannes
>>
>>> Thanks,
>>> Richard
>>>
>>>
>>>
>>
>>
>
On Thu, Jul 27, 2023 at 08:40:51AM +0200, Johannes Zink wrote:
> Hi Richard,
>
> On 7/26/23 17:34, Richard Cochran wrote:
> > That is great, until they change the data sheet. Really, this happens.
>
> I think I don't get your point here.
>
> That's true for literally any register of any peripheral in a datasheet.
> I think we can just stop doing driver development if we wait for a final
> revision that is not changed any more. Datasheets change, and if they do we
> update the driver.

This is different than normal registers, because the values are a
guess as to what the latency in the hardware design is.

Here is how it works in practice:

Vendor first asks a summer intern to measure the latency. Intern does
some kind of random measurement, and that goes into silicon.

One year later, customers discover that the values are bogus.

Vendor doesn't spin a new silicon revision just for that.

If vendor is honest, a footnote appears in the errata that the
corrections are wrong.

Thanks,
Richard
On Thu, Jul 27, 2023 at 08:42:52AM +0200, Johannes Zink wrote:

> good catch, I think adding the register definition won't hurt, but if you
> feel more comfortable about it I can add them only for IP core version 5.

Adding the offsets in the header is not the issue.

The issue is reading from these offsets when there is nothing there to read!

Thanks,
Richard
On Thu, Jul 27, 2023 at 09:20:10AM +0200, Johannes Zink wrote:
> Hi Richard,
>
> On 7/26/23 22:57, Richard Cochran wrote:
> > On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:
> >
> > Earlier versions of the IP core return zero from these...
> >
> > > +#define PTP_TS_INGR_LAT 0x68 /* MAC internal Ingress Latency */
> > > +#define PTP_TS_EGR_LAT 0x6c /* MAC internal Egress Latency */
> >
>
> good catch. Gonna send a v3 with a check and set the values for dwmac v5 only.

AFAICT there is no feature bit that indicates the presence or absence
of these two registers.

Are you sure that *all* v5 IP cores have these?

I am not sure.

Thanks,
Richard
Hi Richard,

On 7/27/23 15:36, Richard Cochran wrote:
> On Thu, Jul 27, 2023 at 09:20:10AM +0200, Johannes Zink wrote:
>> Hi Richard,
>>
>> On 7/26/23 22:57, Richard Cochran wrote:
>>> On Mon, Jul 24, 2023 at 12:01:31PM +0200, Johannes Zink wrote:
>>>
>>> Earlier versions of the IP core return zero from these...
>>>
>>>> +#define PTP_TS_INGR_LAT 0x68 /* MAC internal Ingress Latency */
>>>> +#define PTP_TS_EGR_LAT 0x6c /* MAC internal Egress Latency */
>>>
>>
>> good catch. Gonna send a v3 with a check and set the values for dwmac v5 only.
>
> AFAICT there is no feature bit that indicates the presence or absence
> of these two registers.
>
> Are you sure that *all* v5 IP cores have these?
>
> I am not sure.

I cannot tell for sure either, since I have datasheets for the i.MX8MP only.
Maybe Kurt has some insights here, as he has additional hardware available for
testing?

Nevertheless, I am going to add a guard to only use the correction codepath on
i.MX8MP in v3 for the time being, we can add other hardware later trivially if
they support doing this.

Best regards
Johannes

>
> Thanks,
> Richard
>
>
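Editorial sketch: one way to picture the guard discussed here is an optional callback that is only populated for hardware known to have the latency registers, with the caller skipping the correction when the callback is absent. The fragment below is a user-space model under that assumption. Every name in it is hypothetical, and it does not show what v3 of the patch actually does.

```c
#include <stdbool.h>
#include <stddef.h>

struct fake_priv;

/* Hypothetical ops table, correct_latency stays NULL where unsupported. */
struct fake_ptp_ops {
	void (*correct_latency)(struct fake_priv *priv);
};

struct fake_priv {
	const struct fake_ptp_ops *ptp;
	int corrections_applied;	/* counts calls, for illustration only */
};

static void fake_correct_latency(struct fake_priv *priv)
{
	/* a real driver would read the latency registers here */
	priv->corrections_applied++;
}

static const struct fake_ptp_ops ops_with_correction = {
	.correct_latency = fake_correct_latency,
};

static const struct fake_ptp_ops ops_without_correction = {
	.correct_latency = NULL,
};

/* Pick the ops table from a hypothetical "has latency registers" flag. */
static const struct fake_ptp_ops *select_ptp_ops(bool has_latency_regs)
{
	return has_latency_regs ? &ops_with_correction : &ops_without_correction;
}

/* Returns true if a correction was applied, false if it was skipped. */
static bool maybe_correct_latency(struct fake_priv *priv)
{
	if (!priv->ptp || !priv->ptp->correct_latency)
		return false;

	priv->ptp->correct_latency(priv);
	return true;
}
```

The point of the model is that unsupported hardware never reaches the register reads, which is the failure Richard describes for cores that return zero from those offsets.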
On Mon, Jul 31, 2023 at 09:00:29AM +0200, Johannes Zink wrote:

> I cannot tell for sure either, since I have datasheets for the i.MX8MP only.
> Maybe Kurt has some insights here, as he has additional hardware available
> for testing?

Maybe give the folks who make the dwc a call to clarify?

> Nevertheless, I am going to add a guard to only use the correction codepath
> on i.MX8MP in v3 for the time being, we can add other hardware later
> trivially if they support doing this.

Sure.

Thanks,
Richard
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 6ee7cf07cfd7..95a4d6099577 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -536,6 +536,7 @@ struct stmmac_hwtimestamp {
 	void (*get_systime) (void __iomem *ioaddr, u64 *systime);
 	void (*get_ptptime)(void __iomem *ioaddr, u64 *ptp_time);
 	void (*timestamp_interrupt)(struct stmmac_priv *priv);
+	void (*correct_latency)(struct stmmac_priv *priv);
 };
 
 #define stmmac_config_hw_tstamping(__priv, __args...) \
@@ -554,6 +555,8 @@ struct stmmac_hwtimestamp {
 	stmmac_do_void_callback(__priv, ptp, get_ptptime, __args)
 #define stmmac_timestamp_interrupt(__priv, __args...) \
 	stmmac_do_void_callback(__priv, ptp, timestamp_interrupt, __args)
+#define stmmac_correct_latency(__priv, __args...) \
+	stmmac_do_void_callback(__priv, ptp, correct_latency, __args)
 
 struct stmmac_tx_queue;
 struct stmmac_rx_queue;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
index fa2c3ba7e9fe..7e0fa024e0ad 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
@@ -60,6 +60,48 @@ static void config_sub_second_increment(void __iomem *ioaddr,
 	*ssinc = data;
 }
 
+static void correct_latency(struct stmmac_priv *priv)
+{
+	void __iomem *ioaddr = priv->ptpaddr;
+	u32 reg_tsic, reg_tsicsns;
+	u32 reg_tsec, reg_tsecsns;
+	u64 scaled_ns;
+	u32 val;
+
+	/* MAC-internal ingress latency */
+	scaled_ns = readl(ioaddr + PTP_TS_INGR_LAT);
+
+	/* See section 11.7.2.5.3.1 "Ingress Correction" on page 4001 of
+	 * i.MX8MP Applications Processor Reference Manual Rev. 1, 06/2021
+	 */
+	val = readl(ioaddr + PTP_TCR);
+	if (val & PTP_TCR_TSCTRLSSR)
+		/* nanoseconds field is in decimal format with granularity of 1ns/bit */
+		scaled_ns = ((u64)NSEC_PER_SEC << 16) - scaled_ns;
+	else
+		/* nanoseconds field is in binary format with granularity of ~0.466ns/bit */
+		scaled_ns = ((1ULL << 31) << 16) -
+			DIV_U64_ROUND_CLOSEST(scaled_ns * PSEC_PER_NSEC, 466U);
+
+	reg_tsic = scaled_ns >> 16;
+	reg_tsicsns = scaled_ns & 0xff00;
+
+	/* set bit 31 for 2's compliment */
+	reg_tsic |= BIT(31);
+
+	writel(reg_tsic, ioaddr + PTP_TS_INGR_CORR_NS);
+	writel(reg_tsicsns, ioaddr + PTP_TS_INGR_CORR_SNS);
+
+	/* MAC-internal egress latency */
+	scaled_ns = readl(ioaddr + PTP_TS_EGR_LAT);
+
+	reg_tsec = scaled_ns >> 16;
+	reg_tsecsns = scaled_ns & 0xff00;
+
+	writel(reg_tsec, ioaddr + PTP_TS_EGR_CORR_NS);
+	writel(reg_tsecsns, ioaddr + PTP_TS_EGR_CORR_SNS);
+}
+
 static int init_systime(void __iomem *ioaddr, u32 sec, u32 nsec)
 {
 	u32 value;
@@ -221,4 +263,5 @@ const struct stmmac_hwtimestamp stmmac_ptp = {
 	.get_systime = get_systime,
 	.get_ptptime = get_ptptime,
 	.timestamp_interrupt = timestamp_interrupt,
+	.correct_latency = correct_latency,
 };
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index efe85b086abe..ee78e69e9ae3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -909,6 +909,8 @@ static int stmmac_init_ptp(struct stmmac_priv *priv)
 	priv->hwts_tx_en = 0;
 	priv->hwts_rx_en = 0;
 
+	stmmac_correct_latency(priv, priv);
+
 	return 0;
 }
 
@@ -1094,6 +1096,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
 
 	if (priv->dma_cap.fpesel)
 		stmmac_fpe_link_state_handle(priv, true);
+
+	stmmac_correct_latency(priv, priv);
 }
 
 static const struct phylink_mac_ops stmmac_phylink_mac_ops = {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
index bf619295d079..d1fe4b46f162 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h
@@ -26,6 +26,12 @@
 #define	PTP_ACR		0x40	/* Auxiliary Control Reg */
 #define	PTP_ATNR	0x48	/* Auxiliary Timestamp - Nanoseconds Reg */
 #define	PTP_ATSR	0x4c	/* Auxiliary Timestamp - Seconds Reg */
+#define	PTP_TS_INGR_CORR_NS	0x58	/* Ingress timestamp correction nanoseconds */
+#define	PTP_TS_EGR_CORR_NS	0x5C	/* Egress timestamp correction nanoseconds*/
+#define	PTP_TS_INGR_CORR_SNS	0x60	/* Ingress timestamp correction subnanoseconds */
+#define	PTP_TS_EGR_CORR_SNS	0x64	/* Egress timestamp correction subnanoseconds */
+#define	PTP_TS_INGR_LAT	0x68	/* MAC internal Ingress Latency */
+#define	PTP_TS_EGR_LAT	0x6c	/* MAC internal Egress Latency */
 
 #define	PTP_STNSUR_ADDSUB_SHIFT	31
 #define	PTP_DIGITAL_ROLLOVER_MODE	0x3B9ACA00	/* 10e9-1 ns */
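For readers who want to check the ingress arithmetic outside the kernel, the following is a minimal user-space sketch of the computation that correct_latency() performs. It is not part of the patch: the helper names ingress_correction() and div_round_closest() are hypothetical, the register read is replaced by a plain argument, and the constants are copied from the diff. It assumes, as the patch does, that shifting the scaled value right by 16 yields whole nanoseconds and that only bits 15:8 of the sub-nanosecond part are written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define PSEC_PER_NSEC 1000ULL

/* Round-to-nearest unsigned division, standing in for the kernel's
 * DIV_U64_ROUND_CLOSEST() macro.
 */
static uint64_t div_round_closest(uint64_t n, uint32_t d)
{
	return (n + d / 2) / d;
}

/* Hypothetical helper mirroring the ingress branch of correct_latency().
 * lat:              raw value as read from the ingress latency register
 * digital_rollover: true if PTP_TCR_TSCTRLSSR is set (1 ns per bit),
 *                   false for binary rollover (~0.466 ns per bit)
 * ns_reg, sns_reg:  values that would go to the *_CORR_NS / *_CORR_SNS registers
 */
static void ingress_correction(uint32_t lat, bool digital_rollover,
			       uint32_t *ns_reg, uint32_t *sns_reg)
{
	uint64_t scaled_ns = lat;

	assert(ns_reg && sns_reg);

	if (digital_rollover)
		/* subtract the latency from one full second */
		scaled_ns = (NSEC_PER_SEC << 16) - scaled_ns;
	else
		/* convert to ~0.466 ns units, subtract from 2^31 */
		scaled_ns = ((1ULL << 31) << 16) -
			    div_round_closest(scaled_ns * PSEC_PER_NSEC, 466U);

	/* bit 31 marks the value as a negative (subtracting) correction */
	*ns_reg = (uint32_t)(scaled_ns >> 16) | (1U << 31);
	*sns_reg = (uint32_t)(scaled_ns & 0xff00);
}
```

With a latency value of 0x00100000 (16 ns, no sub-nanosecond part) and digital rollover enabled, the sketch yields 0xBB9AC9F0 for the nanoseconds register, i.e. 999999984 with bit 31 set, and 0 for the sub-nanoseconds register. The egress path in the patch needs no such subtraction and only splits the value into its upper and lower parts.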
The IEEE1588 Standard specifies that the timestamps of Packets must be
captured when the PTP message timestamp point (leading edge of first
octet after the start of frame delimiter) crosses the boundary between
the node and the network. As the MAC latches the timestamp at an
internal point, the captured timestamp must be corrected for the
additional path latency, as described in the publicly available
datasheet [1].

This patch only corrects for the MAC-Internal delay, which can be read
out from the MAC_Ingress_Timestamp_Latency register, since the Phy
framework currently does not support querying the Phy ingress and egress
latency. The Clock Domain Crossing Circuits errors as indicated in [1]
are already being accounted for in the stmmac_get_tx_hwtstamp() function
and are not corrected here.

As the Latency varies for different link speeds and MII
modes of operation, the correction value needs to be updated on each
link state change.

As the delay also causes a phase shift in the timestamp counter compared
to the rest of the network, this correction will also reduce phase error
when generating PPS outputs from the timestamp counter.

[1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
correction"

Signed-off-by: Johannes Zink <j.zink@pengutronix.de>
---
Changes in v2:
- fix builds for 32bit, this was found by the kernel build bot
  Reported-by: kernel test robot <lkp@intel.com>
  Closes: https://lore.kernel.org/oe-kbuild-all/202307200225.B8rmKQPN-lkp@intel.com/
- while at it also fix an overflow by shifting a u32 constant from macro by 10bits
  by casting the constant to u64
- Link to v1: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v1-1-768aa4d09334@pengutronix.de
---
 drivers/net/ethernet/stmicro/stmmac/hwif.h         |  3 ++
 .../net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c  | 43 ++++++++++++++++++++++
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  4 ++
 drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.h   |  6 +++
 4 files changed, 56 insertions(+)
---
base-commit: ba80e20d7f3f87dab3f9f0c0ca66e4b1fcc7be9f
change-id: 20230719-stmmac_correct_mac_delay-4278cb9d9bc1

Best regards,