diff mbox series

[v5,net-next,4/5] net: ena: PHC error bound/flags support

Message ID 20250122102040.752-5-darinzon@amazon.com (mailing list archive)
State Deferred
Delegated to: Netdev Maintainers
Series PHC support in ENA driver

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 5 maintainers not CCed: linux-doc@vger.kernel.org andrew+netdev@lunn.ch linux@treblig.org horms@kernel.org corbet@lwn.net
netdev/build_clang success Errors and warnings before: 6 this patch: 6
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns WARNING: line length of 85 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Arinzon, David Jan. 22, 2025, 10:20 a.m. UTC
Update the PHC algorithm to support reading new PHC values.
Until this change, the driver retrieved only the PHC timestamp from the
device's PHC address. This change expands the API by adding two new
values to ena_admin_phc_resp:
1. PHC error bound:
   PTP HW clock error bound refers to the maximum allowable difference
   between the clock of the device and the reference clock.
   The error bound is used to ensure that the clock of the device
   remains within a certain level of accuracy relative to the reference
   clock. The error bound (expressed in nanoseconds) is calculated by
   the device, taking into account the accuracy of the PTA device,
   march hare network, TOR, Chrony, Pacemaker and ENA driver read delay.
   The error bound field (u32) can hold values of 0-4294967295 (nsec),
   but the driver reports only 0-4294967294 (nsec) because the maximum
   value (4294967295) is reserved to represent an error bound read
   error. The driver retrieves the error bound from the device on every
   get PHC timestamp request and caches it for later retrieval by the
   user.
2. PHC error flags:
   Indicate PHC timestamp and error bound errors.
   The driver retrieves the error flags value from the device on every
   get PHC timestamp request.
   Any PHC error type will:
   1. Put the PHC into blocked state until the blocking time passes
   2. Return a device busy error to the timestamp caller
   3. Return a device busy error to the error bound caller
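
The behaviour described above can be sketched in userspace C. This is an
illustrative model only, not the driver code: the names phc_resp,
phc_cache, phc_consume() and phc_get_error_bound() are hypothetical
stand-ins, and the req_id polling and blocking-time logic are left out.
It mirrors the response layout from this patch (which stays 64 bytes),
checks the error flags before using the sample, and uses the maximum u32
value as the "error bound read error" sentinel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the response layout in this patch. */
struct phc_resp {
	uint16_t req_id;
	uint8_t  reserved1[6];
	uint64_t timestamp;    /* PHC timestamp (nsec) */
	uint8_t  reserved2[8];
	uint32_t error_bound;  /* timestamp error limit (nsec) */
	uint32_t error_flags;  /* bit field of the flags below */
	uint8_t  reserved3[32];
};

/* The two new fields are carved out of the old reserved area. */
_Static_assert(sizeof(struct phc_resp) == 64, "layout must stay 64 bytes");
_Static_assert(offsetof(struct phc_resp, error_bound) == 24, "offset");
_Static_assert(offsetof(struct phc_resp, error_flags) == 28, "offset");

#define PHC_ERROR_FLAG_TIMESTAMP   (1u << 0)
#define PHC_ERROR_FLAG_ERROR_BOUND (1u << 1)
#define PHC_ERROR_FLAGS \
	(PHC_ERROR_FLAG_TIMESTAMP | PHC_ERROR_FLAG_ERROR_BOUND)

/* Max u32 value is reserved as the error bound read error sentinel. */
#define PHC_MAX_ERROR_BOUND 0xFFFFFFFFu

/* Hypothetical stand-in for the cached per-sample driver state. */
struct phc_cache {
	uint32_t error_bound;
};

/* Consume one device response: check flags first, then take the sample. */
static int phc_consume(const struct phc_resp *resp, struct phc_cache *cache,
		       uint64_t *timestamp)
{
	assert(resp && cache && timestamp);

	if (resp->error_flags & PHC_ERROR_FLAGS) {
		/* Any error invalidates the cached error bound as well. */
		cache->error_bound = PHC_MAX_ERROR_BOUND;
		return -EBUSY;
	}

	*timestamp = resp->timestamp;
	cache->error_bound = resp->error_bound;
	return 0;
}

/* Return the cached error bound, or -EBUSY if the last read failed. */
static int phc_get_error_bound(const struct phc_cache *cache,
			       uint32_t *error_bound)
{
	assert(cache && error_bound);

	if (cache->error_bound == PHC_MAX_ERROR_BOUND)
		return -EBUSY;

	*error_bound = cache->error_bound;
	return 0;
}
```

In this model a caller is expected to read the timestamp first and query
the error bound afterwards, so the bound always belongs to the most
recent timestamp sample.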

Signed-off-by: Amit Bernstein <amitbern@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
---
 .../device_drivers/ethernet/amazon/ena.rst    | 15 +++-
 .../net/ethernet/amazon/ena/ena_admin_defs.h  | 28 +++++--
 drivers/net/ethernet/amazon/ena/ena_com.c     | 83 +++++++++++++------
 drivers/net/ethernet/amazon/ena/ena_com.h     | 16 +++-
 drivers/net/ethernet/amazon/ena/ena_phc.c     |  3 +-
 5 files changed, 107 insertions(+), 38 deletions(-)

Patch

diff --git a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
index 12b13da0..19697f63 100644
--- a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
+++ b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
@@ -279,6 +279,16 @@  The ENA device restricts the frequency of PHC get time requests to a maximum
 of 125 requests per second. If this limit is surpassed, the get time request
 will fail, leading to an increment in the phc_err statistic.
 
+**PHC error bound**
+
+PTP HW clock error bound refers to the maximum allowable difference
+between the clock of the device and the reference clock.
+The error bound is used to ensure that the clock of the device
+remains within a certain level of accuracy relative to the reference
+clock. The error bound (expressed in nanoseconds) is calculated by
+the device and is retrieved and cached by the driver upon every get PHC
+timestamp request.
+
 **PHC statistics**
 
 PHC can be monitored using :code:`ethtool -S` counters:
@@ -287,7 +297,10 @@  PHC can be monitored using :code:`ethtool -S` counters:
 **phc_cnt**         Number of successful retrieved timestamps (below expire timeout).
 **phc_exp**         Number of expired retrieved timestamps (above expire timeout).
 **phc_skp**         Number of skipped get time attempts (during block period).
-**phc_err**         Number of failed get time attempts (entering into block state).
+**phc_err**         Number of failed get time attempts due to timestamp/error bound errors
+                    (entering into block state).
+                    Must remain below 1% of all PHC requests to maintain the desired level of
+                    accuracy and reliability.
 =================   ======================================================
 
 PHC timeouts:
diff --git a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
index 28770e60..de5c28f5 100644
--- a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
@@ -128,8 +128,14 @@  enum ena_admin_get_stats_scope {
 	ENA_ADMIN_ETH_TRAFFIC                       = 1,
 };
 
-enum ena_admin_phc_type {
-	ENA_ADMIN_PHC_TYPE_READLESS                 = 0,
+enum ena_admin_phc_feature_version {
+	/* Readless with error_bound */
+	ENA_ADMIN_PHC_FEATURE_VERSION_0             = 0,
+};
+
+enum ena_admin_phc_error_flags {
+	ENA_ADMIN_PHC_ERROR_FLAG_TIMESTAMP   = BIT(0),
+	ENA_ADMIN_PHC_ERROR_FLAG_ERROR_BOUND = BIT(1),
 };
 
 /* ENA SRD configuration for ENI */
@@ -1031,10 +1037,10 @@  struct ena_admin_queue_ext_feature_desc {
 };
 
 struct ena_admin_feature_phc_desc {
-	/* PHC type as defined in enum ena_admin_get_phc_type,
-	 * used only for GET command.
+	/* PHC version as defined in enum ena_admin_phc_feature_version,
+	 * used only for GET command as max supported PHC version by the device.
 	 */
-	u8 type;
+	u8 version;
 
 	/* Reserved - MBZ */
 	u8 reserved1[3];
@@ -1212,13 +1218,23 @@  struct ena_admin_ena_mmio_req_read_less_resp {
 };
 
 struct ena_admin_phc_resp {
+	/* Request Id, received from DB register */
 	u16 req_id;
 
 	u8 reserved1[6];
 
+	/* PHC timestamp (nsec) */
 	u64 timestamp;
 
-	u8 reserved2[48];
+	u8 reserved2[8];
+
+	/* Timestamp error limit (nsec) */
+	u32 error_bound;
+
+	/* Bit field of enum ena_admin_phc_error_flags */
+	u32 error_flags;
+
+	u8 reserved3[32];
 };
 
 /* aq_common_desc */
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index c6b9939e..66b1ab92 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -44,8 +44,10 @@ 
 /* PHC definitions */
 #define ENA_PHC_DEFAULT_EXPIRE_TIMEOUT_USEC 10
 #define ENA_PHC_DEFAULT_BLOCK_TIMEOUT_USEC 1000
-#define ENA_PHC_TIMESTAMP_ERROR 0xFFFFFFFFFFFFFFFF
+#define ENA_PHC_MAX_ERROR_BOUND 0xFFFFFFFF
 #define ENA_PHC_REQ_ID_OFFSET 0xDEAD
+#define ENA_PHC_ERROR_FLAGS (ENA_ADMIN_PHC_ERROR_FLAG_TIMESTAMP | \
+			     ENA_ADMIN_PHC_ERROR_FLAG_ERROR_BOUND)
 
 /*****************************************************************************/
 /*****************************************************************************/
@@ -1682,11 +1684,11 @@  int ena_com_phc_config(struct ena_com_dev *ena_dev)
 	struct ena_admin_set_feat_cmd set_feat_cmd;
 	int ret = 0;
 
-	/* Get device PHC default configuration */
+	/* Get default device PHC configuration */
 	ret = ena_com_get_feature(ena_dev,
 				  &get_feat_resp,
 				  ENA_ADMIN_PHC_CONFIG,
-				  0);
+				  ENA_ADMIN_PHC_FEATURE_VERSION_0);
 	if (unlikely(ret)) {
 		netdev_err(ena_dev->net_device,
 			   "Failed to get PHC feature configuration, error: %d\n",
@@ -1694,10 +1696,10 @@  int ena_com_phc_config(struct ena_com_dev *ena_dev)
 		return ret;
 	}
 
-	/* Supporting only readless PHC retrieval */
-	if (get_feat_resp.u.phc.type != ENA_ADMIN_PHC_TYPE_READLESS) {
-		netdev_err(ena_dev->net_device, "Unsupported PHC type, error: %d\n",
-			   -EOPNOTSUPP);
+	/* Supporting only PHC V0 (readless mode with error bound) */
+	if (get_feat_resp.u.phc.version != ENA_ADMIN_PHC_FEATURE_VERSION_0) {
+		netdev_err(ena_dev->net_device, "Unsupported PHC version (0x%X), error: %d\n",
+			   get_feat_resp.u.phc.version, -EOPNOTSUPP);
 		return -EOPNOTSUPP;
 	}
 
@@ -1720,7 +1722,7 @@  int ena_com_phc_config(struct ena_com_dev *ena_dev)
 				   get_feat_resp.u.phc.block_timeout_usec :
 				   ENA_PHC_DEFAULT_BLOCK_TIMEOUT_USEC;
 
-	/* Sanity check - expire timeout must not be above skip timeout */
+	/* Sanity check - expire timeout must not exceed block timeout */
 	if (phc->expire_timeout_usec > phc->block_timeout_usec)
 		phc->expire_timeout_usec = phc->block_timeout_usec;
 
@@ -1778,7 +1780,7 @@  void ena_com_phc_destroy(struct ena_com_dev *ena_dev)
 	phc->virt_addr = NULL;
 }
 
-int ena_com_phc_get(struct ena_com_dev *ena_dev, u64 *timestamp)
+int ena_com_phc_get_timestamp(struct ena_com_dev *ena_dev, u64 *timestamp)
 {
 	volatile struct ena_admin_phc_resp *read_resp = ena_dev->phc.virt_addr;
 	const ktime_t zero_system_time = ktime_set(0, 0);
@@ -1806,14 +1808,13 @@  int ena_com_phc_get(struct ena_com_dev *ena_dev, u64 *timestamp)
 			goto skip;
 		}
 
-		/* PHC is in active state, update statistics according to
-		 * req_id and timestamp
+		/* PHC is in active state, update statistics according
+		 * to req_id and error_flags
 		 */
 		if ((READ_ONCE(read_resp->req_id) != phc->req_id) ||
-		    read_resp->timestamp == ENA_PHC_TIMESTAMP_ERROR)
-			/* Device didn't update req_id during blocking time
-			 * or timestamp is invalid, this indicates on a
-			 * device error
+		    (read_resp->error_flags & ENA_PHC_ERROR_FLAGS))
+			/* Device didn't update req_id during blocking time or
+			 * timestamp is invalid, this indicates a device error
 			 */
 			phc->stats.phc_err++;
 		else
@@ -1845,36 +1846,46 @@  int ena_com_phc_get(struct ena_com_dev *ena_dev, u64 *timestamp)
 	while (1) {
 		if (unlikely(ktime_after(ktime_get(), expire_time))) {
 			/* Gave up waiting for updated req_id,
-			 * PHC enters into blocked state until passing
-			 * blocking time
+			 * PHC enters into blocked state until passing blocking time,
+			 * during this time any get PHC timestamp or error bound
+			 * requests will fail with device busy error
 			 */
+			phc->error_bound = ENA_PHC_MAX_ERROR_BOUND;
 			ret = -EBUSY;
 			break;
 		}
 
 		/* Check if req_id was updated by the device */
 		if (READ_ONCE(read_resp->req_id) != phc->req_id) {
-			/* req_id was not updated by the device,
+			/* req_id was not updated by the device yet,
 			 * check again on next loop
 			 */
 			continue;
 		}
 
-		/* req_id was updated which indicates that PHC timestamp
-		 * was updated too
+		/* req_id was updated by the device which indicates that
+		 * PHC timestamp, error_bound and error_flags are updated too,
+		 * checking errors before retrieving timestamp and
+		 * error_bound values
 		 */
-		*timestamp = read_resp->timestamp;
-
-		/* PHC timestamp validty check */
-		if (unlikely(*timestamp == ENA_PHC_TIMESTAMP_ERROR)) {
-			/* Retrieved invalid PHC timestamp, PHC enters into
-			 * blocked state until passing blocking time
+		if (unlikely(read_resp->error_flags & ENA_PHC_ERROR_FLAGS)) {
+			/* Retrieved timestamp or error bound errors,
+			 * PHC enters into blocked state until passing blocking time,
+			 * during this time any get PHC timestamp or error bound
+			 * requests will fail with device busy error
 			 */
+			phc->error_bound = ENA_PHC_MAX_ERROR_BOUND;
 			ret = -EBUSY;
 			break;
 		}
 
-		/* Retrieved valid PHC timestamp */
+		/* PHC timestamp value is returned to the caller */
+		*timestamp = read_resp->timestamp;
+
+		/* Error bound value is cached for future retrieval by caller */
+		phc->error_bound = read_resp->error_bound;
+
+		/* Update statistic on valid PHC timestamp retrieval */
 		phc->stats.phc_cnt++;
 
 		/* This indicates PHC state is active */
@@ -1888,6 +1899,24 @@  skip:
 	return ret;
 }
 
+int ena_com_phc_get_error_bound(struct ena_com_dev *ena_dev, u32 *error_bound)
+{
+	struct ena_com_phc_info *phc = &ena_dev->phc;
+	u32 local_error_bound = phc->error_bound;
+
+	if (!phc->active) {
+		netdev_err(ena_dev->net_device, "PHC feature is not active in the device\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (local_error_bound == ENA_PHC_MAX_ERROR_BOUND)
+		return -EBUSY;
+
+	*error_bound = local_error_bound;
+
+	return 0;
+}
+
 int ena_com_mmio_reg_read_request_init(struct ena_com_dev *ena_dev)
 {
 	struct ena_com_mmio_read *mmio_read = &ena_dev->mmio_read;
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.h b/drivers/net/ethernet/amazon/ena/ena_com.h
index 3905d348..8df63eef 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -299,6 +299,9 @@  struct ena_com_phc_info {
 	/* PHC shared memory - physical address */
 	dma_addr_t phys_addr;
 
+	/* Cached error bound per timestamp sample */
+	u32 error_bound;
+
 	/* Request id sent to the device */
 	u16 req_id;
 
@@ -458,12 +461,19 @@  int ena_com_phc_config(struct ena_com_dev *ena_dev);
  */
 void ena_com_phc_destroy(struct ena_com_dev *ena_dev);
 
-/* ena_com_phc_get - Retrieve PHC timestamp
+/* ena_com_phc_get_timestamp - Retrieve PHC timestamp
+ * @ena_dev: ENA communication layer struct
+ * @timestamp: Retrieved PHC timestamp
+ * @return - 0 on success, negative value on failure
+ */
+int ena_com_phc_get_timestamp(struct ena_com_dev *ena_dev, u64 *timestamp);
+
+/* ena_com_phc_get_error_bound - Retrieve cached PHC error bound
  * @ena_dev: ENA communication layer struct
- * @timestamp: Retrieve PHC timestamp
+ * @error_bound: Cached PHC error bound
  * @return - 0 on success, negative value on failure
  */
-int ena_com_phc_get(struct ena_com_dev *ena_dev, u64 *timestamp);
+int ena_com_phc_get_error_bound(struct ena_com_dev *ena_dev, u32 *error_bound);
 
 /* ena_com_set_mmio_read_mode - Enable/disable the indirect mmio reg read mechanism
  * @ena_dev: ENA communication layer struct
diff --git a/drivers/net/ethernet/amazon/ena/ena_phc.c b/drivers/net/ethernet/amazon/ena/ena_phc.c
index 5c1acd88..5ce9a32d 100644
--- a/drivers/net/ethernet/amazon/ena/ena_phc.c
+++ b/drivers/net/ethernet/amazon/ena/ena_phc.c
@@ -38,7 +38,8 @@  static int ena_phc_gettimex64(struct ptp_clock_info *clock_info,
 
 	ptp_read_system_prets(sts);
 
-	rc = ena_com_phc_get(phc_info->adapter->ena_dev, &timestamp_nsec);
+	rc = ena_com_phc_get_timestamp(phc_info->adapter->ena_dev,
+				       &timestamp_nsec);
 
 	ptp_read_system_postts(sts);