diff mbox series

[iwl-next] ice: periodically kick Tx timestamp interrupt

Message ID 20231103162943.485467-1-karol.kolacinski@intel.com (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series [iwl-next] ice: periodically kick Tx timestamp interrupt | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1350 this patch: 1350
netdev/cc_maintainers warning 5 maintainers not CCed: richardcochran@gmail.com kuba@kernel.org pabeni@redhat.com davem@davemloft.net edumazet@google.com
netdev/build_clang success Errors and warnings before: 1378 this patch: 1378
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1378 this patch: 1378
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 62 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Karol Kolacinski Nov. 3, 2023, 4:29 p.m. UTC
From: Jacob Keller <jacob.e.keller@intel.com>

The E822 hardware for Tx timestamping keeps track of how many
outstanding timestamps are still in the PHY memory block. It will not
generate a new interrupt to the MAC until all of the timestamps in the
region have been read.

If somehow all the available data is not read, but the driver has exited
its interrupt routine already, the PHY will not generate a new interrupt
even if new timestamp data is captured. Because no interrupt is
generated, the driver never processes the timestamp data. This state
results in a permanent failure for all future Tx timestamps.

It is not clear how the driver and hardware could enter this state.
However, if it does, there is currently no recovery mechanism.

Add a recovery mechanism via the periodic PTP work thread which invokes
ice_ptp_periodic_work(). Introduce a new check,
ice_ptp_maybe_trigger_tx_interrupt() which checks the PHY timestamp
ready bitmask. If any bits are set, trigger a software interrupt by
writing to PFINT_OICR.

Once triggered, the main timestamp processing thread will read through
the PHY data and clear the outstanding timestamp data. Once cleared, new
data should trigger interrupts as expected.

This should allow recovery from such a state rather than leaving the
device in a state where we cannot process Tx timestamps.

It is possible that this function checks for timestamp data
simultaneously with the interrupt, and it might trigger additional
unnecessary interrupts. This will cause a small amount of additional
processing.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Reviewed-by: Andrii Staikov <andrii.staikov@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_ptp.c | 50 ++++++++++++++++++++++++
 1 file changed, 50 insertions(+)


base-commit: 75eabdb41c25d36bd15661df81033a4e3086c4c9

Comments

Simon Horman Nov. 4, 2023, 3:36 p.m. UTC | #1
On Fri, Nov 03, 2023 at 05:29:43PM +0100, Karol Kolacinski wrote:
> From: Jacob Keller <jacob.e.keller@intel.com>
> 
> The E822 hardware for Tx timestamping keeps track of how many
> outstanding timestamps are still in the PHY memory block. It will not
> generate a new interrupt to the MAC until all of the timestamps in the
> region have been read.
> 
> If somehow all the available data is not read, but the driver has exited
> its interrupt routine already, the PHY will not generate a new interrupt
> even if new timestamp data is captured. Because no interrupt is
> generated, the driver never processes the timestamp data. This state
> results in a permanent failure for all future Tx timestamps.
> 
> It is not clear how the driver and hardware could enter this state.
> However, if it does, there is currently no recovery mechanism.
> 
> Add a recovery mechanism via the periodic PTP work thread which invokes
> ice_ptp_periodic_work(). Introduce a new check,
> ice_ptp_maybe_trigger_tx_interrupt() which checks the PHY timestamp
> ready bitmask. If any bits are set, trigger a software interrupt by
> writing to PFINT_OICR.
> 
> Once triggered, the main timestamp processing thread will read through
> the PHY data and clear the outstanding timestamp data. Once cleared, new
> data should trigger interrupts as expected.
> 
> This should allow recovery from such a state rather than leaving the
> device in a state where we cannot process Tx timestamps.
> 
> It is possible that this function checks for timestamp data
> simultaneously with the interrupt, and it might trigger additional
> unnecessary interrupts. This will cause a small amount of additional
> processing.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
> Reviewed-by: Andrii Staikov <andrii.staikov@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>
Pucha, HimasekharX Reddy Nov. 16, 2023, 8:12 a.m. UTC | #2
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Karol Kolacinski
> Sent: Friday, November 3, 2023 10:00 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; Brandeburg, Jesse <jesse.brandeburg@intel.com>; Staikov, Andrii <andrii.staikov@intel.com>; Kolacinski, Karol <karol.kolacinski@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Keller, Jacob E <jacob.e.keller@intel.com>
> Subject: [Intel-wired-lan] [PATCH iwl-next] ice: periodically kick Tx timestamp interrupt
>
> From: Jacob Keller <jacob.e.keller@intel.com>
>
> The E822 hardware for Tx timestamping keeps track of how many
> outstanding timestamps are still in the PHY memory block. It will not
> generate a new interrupt to the MAC until all of the timestamps in the
> region have been read.
>
> If somehow all the available data is not read, but the driver has exited
> its interrupt routine already, the PHY will not generate a new interrupt
> even if new timestamp data is captured. Because no interrupt is
> generated, the driver never processes the timestamp data. This state
> results in a permanent failure for all future Tx timestamps.
>
> It is not clear how the driver and hardware could enter this state.
> However, if it does, there is currently no recovery mechanism.
>
> Add a recovery mechanism via the periodic PTP work thread which invokes
> ice_ptp_periodic_work(). Introduce a new check,
> ice_ptp_maybe_trigger_tx_interrupt() which checks the PHY timestamp
> ready bitmask. If any bits are set, trigger a software interrupt by
> writing to PFINT_OICR.
>
> Once triggered, the main timestamp processing thread will read through
> the PHY data and clear the outstanding timestamp data. Once cleared, new
> data should trigger interrupts as expected.
>
> This should allow recovery from such a state rather than leaving the
> device in a state where we cannot process Tx timestamps.
>
> It is possible that this function checks for timestamp data
> simultaneously with the interrupt, and it might trigger additional
> unnecessary interrupts. This will cause a small amount of additional
> processing.
>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
> Reviewed-by: Andrii Staikov <andrii.staikov@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_ptp.c | 50 ++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 5fb9dbbdfc16..79c1fa61d1a8 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -2460,6 +2460,54 @@  enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf)
 	}
 }
 
+/**
+ * ice_ptp_maybe_trigger_tx_interrupt - Trigger Tx timstamp interrupt
+ * @pf: Board private structure
+ *
+ * The device PHY issues Tx timestamp interrupts to the driver for processing
+ * timestamp data from the PHY. It will not interrupt again until all
+ * current timestamp data is read. In rare circumstances, it is possible that
+ * the driver fails to read all outstanding data.
+ *
+ * To avoid getting permanently stuck, periodically check if the PHY has
+ * outstanding timestamp data. If so, trigger an interrupt from software to
+ * process this data.
+ */
+static void ice_ptp_maybe_trigger_tx_interrupt(struct ice_pf *pf)
+{
+	struct device *dev = ice_pf_to_dev(pf);
+	struct ice_hw *hw = &pf->hw;
+	bool trigger_oicr = false;
+	unsigned int i;
+
+	if (ice_is_e810(hw))
+		return;
+
+	if (!ice_pf_src_tmr_owned(pf))
+		return;
+
+	for (i = 0; i < ICE_MAX_QUAD; i++) {
+		u64 tstamp_ready;
+		int err;
+
+		err = ice_get_phy_tx_tstamp_ready(&pf->hw, i, &tstamp_ready);
+		if (!err && tstamp_ready) {
+			trigger_oicr = true;
+			break;
+		}
+	}
+
+	if (trigger_oicr) {
+		/* Trigger a software interrupt, to ensure this data
+		 * gets processed.
+		 */
+		dev_dbg(dev, "PTP periodic task detected waiting timestamps. Triggering Tx timestamp interrupt now.\n");
+
+		wr32(hw, PFINT_OICR, PFINT_OICR_TSYN_TX_M);
+		ice_flush(hw);
+	}
+}
+
 static void ice_ptp_periodic_work(struct kthread_work *work)
 {
 	struct ice_ptp *ptp = container_of(work, struct ice_ptp, work.work);
@@ -2471,6 +2519,8 @@  static void ice_ptp_periodic_work(struct kthread_work *work)
 
 	err = ice_ptp_update_cached_phctime(pf);
 
+	ice_ptp_maybe_trigger_tx_interrupt(pf);
+
 	/* Run twice a second or reschedule if phc update failed */
 	kthread_queue_delayed_work(ptp->kworker, &ptp->work,
 				   msecs_to_jiffies(err ? 10 : 500));