diff mbox series

[v2,net-next,2/3] net: phy: nxp-c45-tja11xx: fix potential RX timestamp wraparound

Message ID 20210614123815.443467-3-olteanv@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series Fixes and improvements to TJA1103 PHY driver | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link

Commit Message

Vladimir Oltean June 14, 2021, 12:38 p.m. UTC
From: Vladimir Oltean <vladimir.oltean@nxp.com>

The reconstruction procedure for partial timestamps reads the current
PTP time and fills in the low 2 bits of the second portion, as well as
the nanoseconds portion, from the actual hardware packet timestamp.
Critically, the reconstruction procedure works because it assumes that
the current PTP time is strictly larger than the hardware timestamp was:
it detects a 2-bit wraparound of the 'seconds' portion by checking whether
the 'seconds' portion of the partial hardware timestamp is larger than
the 'seconds' portion of the current time. That can only happen if the
hardware timestamp was captured by the PHY during the last phase of a
'modulo 4 seconds' interval, and the current PTP time was read by the
driver during the initial phase of the next 'modulo 4 seconds' interval.

The partial RX timestamps are added to priv->rx_queue in
nxp_c45_rxtstamp() and they are processed potentially in parallel by the
aux worker thread in nxp_c45_do_aux_work(). This means that it is
possible for nxp_c45_do_aux_work() to process more than one RX timestamp
during the same schedule.

There is one premature optimization that will cause issues: for RX
timestamping, the driver reads the current time only once, and it uses
that to reconstruct all PTP RX timestamps in the queue. For the second
and later timestamps, this will be an issue if we are processing two RX
timestamps which are to the left and to the right, respectively, of a
4-bit wraparound of the 'seconds' portion of the PTP time, and the
current PTP time is also pre-wraparound.

 0.000000000        4.000000000        8.000000000        12.000000000
 |..................|..................|..................|............>
                 ^ ^ ^ ^                                            time
                 | | | |
                 | | | process hwts 1 and hwts 2
                 | | |
                 | | hwts 2
                 | |
                 | read current PTP time
                 |
                 hwts 1

What will happen in that case is that hwts 2 (post-wraparound) will use
a stale current PTP time that is pre-wraparound.
But nxp_c45_reconstruct_ts will not detect this condition, because it is
not coded up for it, so it will reconstruct hwts 2 with a current time
from the previous 4 second interval (i.e. 0.something instead of
4.something).

This is solvable by making sure that the full 64-bit current time is
always read after the PHY has taken the partial RX timestamp. We do this
by reading the current PTP time for every timestamp in the RX queue.

Fixes: 514def5dd339 ("phy: nxp-c45-tja11xx: add timestamping support")
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v1->v2: none

 drivers/net/phy/nxp-c45-tja11xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Russell King (Oracle) June 14, 2021, 12:59 p.m. UTC | #1
On Mon, Jun 14, 2021 at 03:38:14PM +0300, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> 
> The reconstruction procedure for partial timestamps reads the current
> PTP time and fills in the low 2 bits of the second portion, as well as
> the nanoseconds portion, from the actual hardware packet timestamp.
> Critically, the reconstruction procedure works because it assumes that
> the current PTP time is strictly larger than the hardware timestamp was:
> it detects a 2-bit wraparound of the 'seconds' portion by checking whether
> the 'seconds' portion of the partial hardware timestamp is larger than
> the 'seconds' portion of the current time. That can only happen if the
> hardware timestamp was captured by the PHY during the last phase of a
> 'modulo 4 seconds' interval, and the current PTP time was read by the
> driver during the initial phase of the next 'modulo 4 seconds' interval.
> 
> The partial RX timestamps are added to priv->rx_queue in
> nxp_c45_rxtstamp() and they are processed potentially in parallel by the
> aux worker thread in nxp_c45_do_aux_work(). This means that it is
> possible for nxp_c45_do_aux_work() to process more than one RX timestamp
> during the same schedule.
> 
> There is one premature optimization that will cause issues: for RX
> timestamping, the driver reads the current time only once, and it uses
> that to reconstruct all PTP RX timestamps in the queue. For the second
> and later timestamps, this will be an issue if we are processing two RX
> timestamps which are to the left and to the right, respectively, of a
> 4-bit wraparound of the 'seconds' portion of the PTP time, and the
> current PTP time is also pre-wraparound.
> 
>  0.000000000        4.000000000        8.000000000        12.000000000
>  |..................|..................|..................|............>
>                  ^ ^ ^ ^                                            time
>                  | | | |
>                  | | | process hwts 1 and hwts 2
>                  | | |
>                  | | hwts 2
>                  | |
>                  | read current PTP time
>                  |
>                  hwts 1
> 
> What will happen in that case is that hwts 2 (post-wraparound) will use
> a stale current PTP time that is pre-wraparound.
> But nxp_c45_reconstruct_ts will not detect this condition, because it is
> not coded up for it, so it will reconstruct hwts 2 with a current time
> from the previous 4 second interval (i.e. 0.something instead of
> 4.something).
> 
> This is solvable by making sure that the full 64-bit current time is
> always read after the PHY has taken the partial RX timestamp. We do this
> by reading the current PTP time for every timestamp in the RX queue.
> 
> Fixes: 514def5dd339 ("phy: nxp-c45-tja11xx: add timestamping support")
> Cc: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
> v1->v2: none
> 
>  drivers/net/phy/nxp-c45-tja11xx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/nxp-c45-tja11xx.c b/drivers/net/phy/nxp-c45-tja11xx.c
> index 902fe1aa7782..118b393b1cbb 100644
> --- a/drivers/net/phy/nxp-c45-tja11xx.c
> +++ b/drivers/net/phy/nxp-c45-tja11xx.c
> @@ -427,8 +427,8 @@ static long nxp_c45_do_aux_work(struct ptp_clock_info *ptp)
>  		nxp_c45_process_txts(priv, &hwts);
>  	}
>  
> -	nxp_c45_ptp_gettimex64(&priv->caps, &ts, NULL);
>  	while ((skb = skb_dequeue(&priv->rx_queue)) != NULL) {
> +		nxp_c45_ptp_gettimex64(&priv->caps, &ts, NULL);
>  		ts_raw = __be32_to_cpu(NXP_C45_SKB_CB(skb)->header->reserved2);
>  		hwts.sec = ts_raw >> 30;
>  		hwts.nsec = ts_raw & GENMASK(29, 0);

This looks good, but while reviewing it, I've been looking at
nxp_c45_reconstruct_ts(). While it is logically correct, this makes
it difficult to understand:

	if ((ts->tv_sec & TS_SEC_MASK) < (hwts->sec & TS_SEC_MASK))
		ts->tv_sec -= BIT(2);

The use of BIT() here seems completely inappropriate and misleading.

The value we really want to be using here is TS_SEC_MASK + 1 - this
has the advantage of making it completely clear that _this_ value
depends on TS_SEC_MASK, and is not some unrelated value.

In any case, for _this_ patch:

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
diff mbox series

Patch

diff --git a/drivers/net/phy/nxp-c45-tja11xx.c b/drivers/net/phy/nxp-c45-tja11xx.c
index 902fe1aa7782..118b393b1cbb 100644
--- a/drivers/net/phy/nxp-c45-tja11xx.c
+++ b/drivers/net/phy/nxp-c45-tja11xx.c
@@ -427,8 +427,8 @@  static long nxp_c45_do_aux_work(struct ptp_clock_info *ptp)
 		nxp_c45_process_txts(priv, &hwts);
 	}
 
-	nxp_c45_ptp_gettimex64(&priv->caps, &ts, NULL);
 	while ((skb = skb_dequeue(&priv->rx_queue)) != NULL) {
+		nxp_c45_ptp_gettimex64(&priv->caps, &ts, NULL);
 		ts_raw = __be32_to_cpu(NXP_C45_SKB_CB(skb)->header->reserved2);
 		hwts.sec = ts_raw >> 30;
 		hwts.nsec = ts_raw & GENMASK(29, 0);