Message ID | 20211119081010.27084-1-kurt@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [net-next,v1] net: stmmac: Caclucate clock domain crossing error only once | expand |
Kurt, On Fri, Nov 19 2021 at 09:10, Kurt Kanzenbach wrote: > The clock domain crossing error (CDC) is calculated at every fetch of Tx or Rx > timestamps. It includes a division. Especially on arm32 based systems it is > expensive. It also saves the two conditionals. This does not make sense. What you want to say here is: It also requires two conditionals in the hotpath. > Therefore, move the calculation to the PTP initialization code and just use the > cached value in the timestamp retrieval functions. Maybe: Add a compensation value cache to struct plat_stmmacenet_data and subtract it unconditionally in the RX/TX functions which spares the conditionals. The value is initialized to 0 and if supported calculated in the PTP initialization code. or something to that effect. > + /* Calculate the clock domain crossing (CDC) error if necessary */ > + priv->plat->cdc_error_adj = 0; > + if (priv->plat->has_gmac4 && priv->plat->clk_ptp_rate) > + priv->plat->cdc_error_adj = (2 * NSEC_PER_SEC) / > + priv->plat->clk_ptp_rate; Nit. Just let stick it out. We lifted the 80 char limitation some time ago. Thanks, tglx
On Fri Nov 19 2021, Thomas Gleixner wrote: > Kurt, > > On Fri, Nov 19 2021 at 09:10, Kurt Kanzenbach wrote: > >> The clock domain crossing error (CDC) is calculated at every fetch of Tx or Rx >> timestamps. It includes a division. Especially on arm32 based systems it is >> expensive. It also saves the two conditionals. > > This does not make sense. What you want to say here is: > > It also requires two conditionals in the hotpath. Yeah, I realized after sending that the last sentence is bogus. > >> Therefore, move the calculation to the PTP initialization code and just use the >> cached value in the timestamp retrieval functions. > > Maybe: > > Add a compensation value cache to struct plat_stmmacenet_data and > subtract it unconditionally in the RX/TX functions which spares the > conditionals. > > The value is initialized to 0 and if supported calculated in the PTP > initialization code. Sounds better, thanks. > > or something to that effect. > >> + /* Calculate the clock domain crossing (CDC) error if necessary */ >> + priv->plat->cdc_error_adj = 0; >> + if (priv->plat->has_gmac4 && priv->plat->clk_ptp_rate) >> + priv->plat->cdc_error_adj = (2 * NSEC_PER_SEC) / >> + priv->plat->clk_ptp_rate; > > Nit. Just let stick it out. We lifted the 80 char limitation some time ago. Good to know. Will do. I'll wait for further comments or test results and will resend next week. Thanks, Kurt
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 21111df73719..340076b5bb38 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -511,14 +511,6 @@ bool stmmac_eee_init(struct stmmac_priv *priv) return true; } -static inline u32 stmmac_cdc_adjust(struct stmmac_priv *priv) -{ - /* Correct the clk domain crossing(CDC) error */ - if (priv->plat->has_gmac4 && priv->plat->clk_ptp_rate) - return (2 * NSEC_PER_SEC) / priv->plat->clk_ptp_rate; - return 0; -} - /* stmmac_get_tx_hwtstamp - get HW TX timestamps * @priv: driver private structure * @p : descriptor pointer @@ -550,7 +542,7 @@ static void stmmac_get_tx_hwtstamp(struct stmmac_priv *priv, } if (found) { - ns -= stmmac_cdc_adjust(priv); + ns -= priv->plat->cdc_error_adj; memset(&shhwtstamp, 0, sizeof(struct skb_shared_hwtstamps)); shhwtstamp.hwtstamp = ns_to_ktime(ns); @@ -587,7 +579,7 @@ static void stmmac_get_rx_hwtstamp(struct stmmac_priv *priv, struct dma_desc *p, if (stmmac_get_rx_timestamp_status(priv, p, np, priv->adv_ts)) { stmmac_get_timestamp(priv, desc, priv->adv_ts, &ns); - ns -= stmmac_cdc_adjust(priv); + ns -= priv->plat->cdc_error_adj; netdev_dbg(priv->dev, "get valid RX hw timestamp %llu\n", ns); shhwtstamp = skb_hwtstamps(skb); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c index 580cc035536b..96b9e4175f08 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c @@ -309,6 +309,12 @@ void stmmac_ptp_register(struct stmmac_priv *priv) if (priv->plat->ptp_max_adj) stmmac_ptp_clock_ops.max_adj = priv->plat->ptp_max_adj; + /* Calculate the clock domain crossing (CDC) error if necessary */ + priv->plat->cdc_error_adj = 0; + if (priv->plat->has_gmac4 && priv->plat->clk_ptp_rate) + priv->plat->cdc_error_adj = (2 * NSEC_PER_SEC) / + priv->plat->clk_ptp_rate; + stmmac_ptp_clock_ops.n_per_out = priv->dma_cap.pps_out_num; stmmac_ptp_clock_ops.n_ext_ts = priv->dma_cap.aux_snapshot_n; diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h index a6f03b36fc4f..89b8e208cd7b 100644 --- a/include/linux/stmmac.h +++ b/include/linux/stmmac.h @@ -241,6 +241,7 @@ struct plat_stmmacenet_data { unsigned int clk_ref_rate; unsigned int mult_fact_100ns; s32 ptp_max_adj; + u32 cdc_error_adj; struct reset_control *stmmac_rst; struct reset_control *stmmac_ahb_rst; struct stmmac_axi *axi;
The clock domain crossing error (CDC) is calculated at every fetch of Tx or Rx timestamps. It includes a division. Especially on arm32 based systems it is expensive. It also saves the two conditionals. Therefore, move the calculation to the PTP initialization code and just use the cached value in the timestamp retrieval functions. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> --- Bene, would you mind to test this patch on stm32mp1 too? drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 ++---------- drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c | 6 ++++++ include/linux/stmmac.h | 1 + 3 files changed, 9 insertions(+), 10 deletions(-)