
[net-next,09/15] net/mlx5: Add counter information to mlx5 driver documentation

Message ID 20230204100854.388126-10-saeed@kernel.org (mailing list archive)
State Accepted
Commit 8ce3b586faa471ab750bd201c0fd063c2a29e515
Delegated to: Netdev Maintainers
Series [net-next,01/15] net/mlx5: Lag, Update multiport eswitch check to log an error

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Pull request is its own cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 8 maintainers not CCed: richardcochran@gmail.com john.fastabend@gmail.com daniel@iogearbox.net corbet@lwn.net linux-doc@vger.kernel.org bpf@vger.kernel.org hawk@kernel.org ast@kernel.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Saeed Mahameed Feb. 4, 2023, 10:08 a.m. UTC
From: Rahul Rameshbabu <rrameshbabu@nvidia.com>

Update rst file to contain general information about statistics counters
for the mlx5 driver. Add specifics about individual counters in list
tables.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/counters.rst       | 1302 +++++++++++++++++
 .../ethernet/mellanox/mlx5/index.rst          |    1 +
 2 files changed, 1303 insertions(+)
 create mode 100644 Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst

Patch

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
new file mode 100644
index 000000000000..4cd8e869762b
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
@@ -0,0 +1,1302 @@ 
+.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+.. include:: <isonum.txt>
+
+================
+Ethtool counters
+================
+
+:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+Contents
+========
+
+- `Overview`_
+- `Groups`_
+- `Types`_
+- `Descriptions`_
+
+Overview
+========
+
+There are several counter groups based on where the counter is being counted. In
+addition, each group of counters may have different counter types.
+
+These counter groups are based on the component of the networking setup,
+illustrated below, that each group describes::
+
+                                                  ----------------------------------------
+                                                  |                                      |
+    ----------------------------------------    ---------------------------------------- |
+    |              Hypervisor              |    |                  VM                  | |
+    |                                      |    |                                      | |
+    | -------------------  --------------- |    | -------------------  --------------- | |
+    | | Ethernet driver |  | RDMA driver | |    | | Ethernet driver |  | RDMA driver | | |
+    | -------------------  --------------- |    | -------------------  --------------- | |
+    |           |                 |        |    |           |                 |        | |
+    |           -------------------        |    |           -------------------        | |
+    |                   |                  |    |                   |                  |--
+    ----------------------------------------    ----------------------------------------
+                        |                                           |
+            -------------               -----------------------------
+            |                           |
+         ------                      ------ ------ ------         ------      ------      ------
+    -----| PF |----------------------| VF |-| VF |-| VF |-----  --| PF |--- --| PF |--- --| PF |---
+    |    ------                      ------ ------ ------    |  | ------  | | ------  | | ------  |
+    |                                                        |  |         | |         | |         |
+    |                                                        |  |         | |         | |         |
+    |                                                        |  |         | |         | |         |
+    | eSwitch                                                |  | eSwitch | | eSwitch | | eSwitch |
+    ----------------------------------------------------------  ----------- ----------- -----------
+               -------------------------------------------------------------------------------
+               |                                                                             |
+               |                                                                             |
+               | Uplink (no counters)                                                        |
+               -------------------------------------------------------------------------------
+                       ---------------------------------------------------------------
+                       |                                                             |
+                       |                                                             |
+                       | MPFS (no counters)                                          |
+                       ---------------------------------------------------------------
+                                                     |
+                                                     |
+                                                     | Port
+
+Groups
+======
+
+Ring
+  Software counters populated by the driver stack.
+
+Netdev
+  An aggregation of software ring counters.
+
+vPort counters
+  Traffic counters and drops due to steering or no buffers. May indicate issues
+  with the NIC. These counters include Ethernet traffic counters (including Raw
+  Ethernet) and RDMA/RoCE traffic counters.
+
+Physical port counters
+  Counters that collect statistics about the PFs and VFs. May indicate issues
+  with the NIC, link, or network. This measuring point holds information on
+  standardized counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and
+  additional counters like flow control, FEC and more. Physical port counters
+  are not exposed to virtual machines.
+
+Priority Port Counters
+  A set of the physical port counters, per priority per port.
+
+Types
+=====
+
+Counters are divided into three types.
+
+Traffic Informative Counters
+  Counters which count traffic. These counters can be used for load estimation
+  or for general debug.
+
+Traffic Acceleration Counters
+  Counters which count traffic that was accelerated by the Mellanox driver or by
+  hardware. The counters are an additional layer to the informative counter set,
+  and the same traffic is counted in both informative and acceleration counters.
+
+.. [#accel] Traffic acceleration counter.
+
+Error Counters
+  An increment of these counters might indicate a problem. Each of these
+  counters has an explanation and a corrective action.
+
+Statistics can be fetched via the `ip link` or `ethtool` commands. `ethtool`
+provides more detailed information::
+
+    ip -s link show <if-name>
+    ethtool -S <if-name>
+
+Descriptions
+============
+
+XSK, PTP, and QoS counters that are similar to counters defined previously will
+not be separately listed. For example, `ptp_tx[i]_packets` will not be
+explicitly documented since `tx[i]_packets` describes the behavior of both
+counters, except `ptp_tx[i]_packets` is only counted when precision time
+protocol is used.
+
+Ring / Netdev Counter
+----------------------------
+The following counters are available per ring or software port.
+
+These counters provide information on the amount of traffic that was accelerated
+by the NIC. They count accelerated traffic in addition to the standard counters
+that count it (i.e. accelerated traffic is counted twice).
+
+The counter names in the table below refer to both ring and port counters. The
+notation for ring counters includes the [i] index without the brackets. The
+notation for port counters doesn't include the [i]. A counter name
+`rx[i]_packets` will be printed as `rx0_packets` for ring 0 and `rx_packets` for
+the software port.
+
+.. flat-table:: Ring / Software Port Counter Table
+   :widths: 2 3 1
+
+   * - Counter
+     - Description
+     - Type
+
+   * - `rx[i]_packets`
+     - The number of packets received on ring i.
+     - Informative
+
+   * - `rx[i]_bytes`
+     - The number of bytes received on ring i.
+     - Informative
+
+   * - `tx[i]_packets`
+     - The number of packets transmitted on ring i.
+     - Informative
+
+   * - `tx[i]_bytes`
+     - The number of bytes transmitted on ring i.
+     - Informative
+
+   * - `tx[i]_recover`
+     - The number of times the SQ was recovered.
+     - Error
+
+   * - `tx[i]_cqes`
+     - The number of CQE events issued on the SQ of ring i.
+     - Informative
+
+   * - `tx[i]_cqe_err`
+     - The number of error CQEs encountered on the SQ for ring i.
+     - Error
+
+   * - `tx[i]_tso_packets`
+     - The number of TSO packets transmitted on ring i [#accel]_.
+     - Acceleration
+
+   * - `tx[i]_tso_bytes`
+     - The number of TSO bytes transmitted on ring i [#accel]_.
+     - Acceleration
+
+   * - `tx[i]_tso_inner_packets`
+     - The number of TSO packets which are indicated to carry internal
+       encapsulation transmitted on ring i [#accel]_.
+     - Acceleration
+
+   * - `tx[i]_tso_inner_bytes`
+     - The number of TSO bytes which are indicated to carry internal
+       encapsulation transmitted on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_gro_packets`
+     - The number of received packets processed using hardware-accelerated GRO
+       on ring i.
+     - Acceleration
+
+   * - `rx[i]_gro_bytes`
+     - The number of received bytes processed using hardware-accelerated GRO
+       on ring i.
+     - Acceleration
+
+   * - `rx[i]_gro_skbs`
+     - The number of receive SKBs constructed while performing
+       hardware-accelerated GRO.
+     - Informative
+
+   * - `rx[i]_gro_match_packets`
+     - Number of received packets processed using hardware-accelerated GRO that
+       met the flow table match criteria.
+     - Informative
+
+   * - `rx[i]_gro_large_hds`
+     - Number of receive packets using hardware-accelerated GRO that have large
+       headers that require additional memory to be allocated.
+     - Informative
+
+   * - `rx[i]_lro_packets`
+     - The number of LRO packets received on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_lro_bytes`
+     - The number of LRO bytes received on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_ecn_mark`
+     - The number of received packets where the ECN mark was turned on.
+     - Informative
+
+   * - `rx_oversize_pkts_buffer`
+     - The number of received packets dropped because they arrived at the RQ
+       and exceeded the software buffer size allocated by the device for
+       incoming traffic. It might imply that the device MTU is larger than the
+       software buffer size.
+     - Error
+
+   * - `rx_oversize_pkts_sw_drop`
+     - Number of received packets dropped in software because the CQE data is
+       larger than the MTU size.
+     - Error
+
+   * - `rx[i]_csum_unnecessary`
+     - Packets received with a `CHECKSUM_UNNECESSARY` on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_csum_unnecessary_inner`
+     - Packets received with inner encapsulation with a `CHECKSUM_UNNECESSARY`
+       on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_csum_none`
+     - Packets received with a `CHECKSUM_NONE` on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_csum_complete`
+     - Packets received with a `CHECKSUM_COMPLETE` on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_csum_complete_tail`
+     - Number of received packets that had checksum calculation computed,
+       potentially needed padding, and were able to do so with
+       `CHECKSUM_PARTIAL`.
+     - Informative
+
+   * - `rx[i]_csum_complete_tail_slow`
+     - Number of received packets that need padding larger than eight bytes for
+       the checksum.
+     - Informative
+
+   * - `tx[i]_csum_partial`
+     - Packets transmitted with a `CHECKSUM_PARTIAL` on ring i [#accel]_.
+     - Acceleration
+
+   * - `tx[i]_csum_partial_inner`
+     - Packets transmitted with inner encapsulation with a `CHECKSUM_PARTIAL` on
+       ring i [#accel]_.
+     - Acceleration
+
+   * - `tx[i]_csum_none`
+     - Packets transmitted with no hardware checksum acceleration on ring i.
+     - Informative
+
+   * - `tx[i]_stopped` / `tx_queue_stopped` [#ring_global]_
+     - Events where SQ was full on ring i. If this counter is increased, check
+       the amount of buffers allocated for transmission.
+     - Informative
+
+   * - `tx[i]_wake` / `tx_queue_wake` [#ring_global]_
+     - Events where SQ was full and has become not full on ring i.
+     - Informative
+
+   * - `tx[i]_dropped` / `tx_queue_dropped` [#ring_global]_
+     - Packets transmitted that were dropped due to DMA mapping failure on
+       ring i. If this counter is increased, check the amount of buffers
+       allocated for transmission.
+     - Error
+
+   * - `tx[i]_nop`
+     - The number of nop WQEs (empty WQEs) inserted into the SQ (related to
+       ring i) because the end of the cyclic buffer was reached. When nearing
+       the end of the cyclic buffer, the driver may add those empty WQEs to
+       avoid handling a state where a WQE starts at the end of the queue and
+       ends at the beginning of the queue. This is a normal condition.
+     - Informative
+
+   * - `tx[i]_added_vlan_packets`
+     - The number of packets sent where vlan tag insertion was offloaded to the
+       hardware.
+     - Acceleration
+
+   * - `rx[i]_removed_vlan_packets`
+     - The number of packets received where vlan tag stripping was offloaded to
+       the hardware.
+     - Acceleration
+
+   * - `rx[i]_wqe_err`
+     - The number of wrong opcodes received on ring i.
+     - Error
+
+   * - `rx[i]_mpwqe_frag`
+     - The number of WQEs that failed to allocate a compound page, so
+       fragmented MPWQEs (Multi Packet WQEs) were used on ring i. If this
+       counter rises, it may suggest that there is not enough memory for large
+       pages, so the driver allocated fragmented pages. This is not an abnormal
+       condition.
+     - Informative
+
+   * - `rx[i]_mpwqe_filler_cqes`
+     - The number of filler CQE events that were issued on ring i.
+     - Informative
+
+   * - `rx[i]_mpwqe_filler_strides`
+     - The number of strides consumed by filler CQEs on ring i.
+     - Informative
+
+   * - `tx[i]_mpwqe_blks`
+     - The number of send blocks processed from Multi-Packet WQEs (mpwqe).
+     - Informative
+
+   * - `tx[i]_mpwqe_pkts`
+     - The number of send packets processed from Multi-Packet WQEs (mpwqe).
+     - Informative
+
+   * - `rx[i]_cqe_compress_blks`
+     - The number of receive blocks with CQE compression on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_cqe_compress_pkts`
+     - The number of receive packets with CQE compression on ring i [#accel]_.
+     - Acceleration
+
+   * - `rx[i]_cache_reuse`
+     - The number of events of successful reuse of a page from a driver's
+       internal page cache.
+     - Acceleration
+
+   * - `rx[i]_cache_full`
+     - The number of events of full internal page cache where driver can't put a
+       page back to the cache for recycling (page will be freed).
+     - Acceleration
+
+   * - `rx[i]_cache_empty`
+     - The number of events where the cache was empty, with no page to give. The
+       driver allocates a new page.
+     - Acceleration
+
+   * - `rx[i]_cache_busy`
+     - The number of events where the cache head was busy and could not be
+       recycled. The driver allocated a new page.
+     - Acceleration
+
+   * - `rx[i]_cache_waive`
+     - The number of cache evacuations. This can occur when a page moves to
+       another NUMA node or when a page was pfmemalloc-ed and should be freed as
+       soon as possible.
+     - Acceleration
+
+   * - `rx[i]_arfs_err`
+     - Number of flow rules that failed to be added to the flow table.
+     - Error
+
+   * - `rx[i]_recover`
+     - The number of times the RQ was recovered.
+     - Error
+
+   * - `tx[i]_xmit_more`
+     - The number of packets sent with `xmit_more` indication set on the skbuff
+       (no doorbell).
+     - Acceleration
+
+   * - `ch[i]_poll`
+     - The number of invocations of NAPI poll of channel i.
+     - Informative
+
+   * - `ch[i]_arm`
+     - The number of times the NAPI poll function completed and armed the
+       completion queues on channel i.
+     - Informative
+
+   * - `ch[i]_aff_change`
+     - The number of times the NAPI poll function explicitly stopped execution
+       on a CPU due to a change in affinity, on channel i.
+     - Informative
+
+   * - `ch[i]_events`
+     - The number of hard interrupt events on the completion queues of channel i.
+     - Informative
+
+   * - `ch[i]_eq_rearm`
+     - The number of times the EQ was recovered.
+     - Error
+
+   * - `ch[i]_force_irq`
+     - Number of times NAPI is triggered by XSK wakeups by posting a NOP to
+       ICOSQ.
+     - Acceleration
+
+   * - `rx[i]_congst_umr`
+     - The number of times an outstanding UMR request is delayed due to
+       congestion, on ring i.
+     - Informative
+
+   * - `rx_pp_alloc_fast`
+     - Number of successful fast path allocations.
+     - Informative
+
+   * - `rx_pp_alloc_slow`
+     - Number of slow path order-0 allocations.
+     - Informative
+
+   * - `rx_pp_alloc_slow_high_order`
+     - Number of slow path high order allocations.
+     - Informative
+
+   * - `rx_pp_alloc_empty`
+     - Counter is incremented when ptr ring is empty, so a slow path allocation
+       was forced.
+     - Informative
+
+   * - `rx_pp_alloc_refill`
+     - Counter is incremented when an allocation triggered a refill of the
+       cache.
+     - Informative
+
+   * - `rx_pp_alloc_waive`
+     - Counter is incremented when pages obtained from the ptr ring cannot be
+       added to the cache due to a NUMA mismatch.
+     - Informative
+
+   * - `rx_pp_recycle_cached`
+     - Counter is incremented when recycling placed a page in the page pool
+       cache.
+     - Informative
+
+   * - `rx_pp_recycle_cache_full`
+     - Counter is incremented when the page pool cache was full.
+     - Informative
+
+   * - `rx_pp_recycle_ring`
+     - Counter is incremented when a page is placed into the ptr ring.
+     - Informative
+
+   * - `rx_pp_recycle_ring_full`
+     - Counter is incremented when a page is released from the page pool because
+       the ptr ring was full.
+     - Informative
+
+   * - `rx_pp_recycle_released_ref`
+     - Counter is incremented when a page is released (and not recycled) because
+       refcnt > 1.
+     - Informative
+
+   * - `rx[i]_xsk_buff_alloc_err`
+     - The number of times allocating an skb or XSK buffer failed in the XSK RQ
+       context.
+     - Error
+
+   * - `rx[i]_xsk_arfs_err`
+     - aRFS (accelerated Receive Flow Steering) does not occur in the XSK RQ
+       context, so this counter should never increment.
+     - Error
+
+   * - `rx[i]_xdp_tx_xmit`
+     - The number of packets forwarded back to the port due to XDP program
+       `XDP_TX` action (bouncing). These packets are not counted by other
+       software counters. These packets are counted by physical port and vPort
+       counters.
+     - Informative
+
+   * - `rx[i]_xdp_tx_mpwqe`
+     - Number of multi-packet WQEs transmitted by the netdev and `XDP_TX`-ed by
+       the netdev during the RQ context.
+     - Acceleration
+
+   * - `rx[i]_xdp_tx_inlnw`
+     - Number of WQE data segments transmitted where the data could be inlined
+       in the WQE and then `XDP_TX`-ed during the RQ context.
+     - Acceleration
+
+   * - `rx[i]_xdp_tx_nops`
+     - Number of NOP WQEBBs (WQE building blocks) posted to the XDP SQ.
+     - Acceleration
+
+   * - `rx[i]_xdp_tx_full`
+     - The number of packets that should have been forwarded back to the port
+       due to `XDP_TX` action but were dropped due to a full tx queue. These
+       packets are not counted by other software counters. These packets are
+       counted by physical port and vPort counters. You may open more rx queues
+       and spread rx traffic over all queues and/or increase rx ring size.
+     - Error
+
+   * - `rx[i]_xdp_tx_err`
+     - The number of times an `XDP_TX` error such as frame too long and frame
+       too short occurred on `XDP_TX` ring of RX ring.
+     - Error
+
+   * - `rx[i]_xdp_tx_cqes` / `rx_xdp_tx_cqe` [#ring_global]_
+     - The number of completions received on the CQ of the `XDP_TX` ring.
+     - Informative
+
+   * - `rx[i]_xdp_drop`
+     - The number of packets dropped due to XDP program `XDP_DROP` action. These
+       packets are not counted by other software counters. These packets are
+       counted by physical port and vPort counters.
+     - Informative
+
+   * - `rx[i]_xdp_redirect`
+     - The number of times an XDP redirect action was triggered on ring i.
+     - Acceleration
+
+   * - `tx[i]_xdp_xmit`
+     - The number of packets redirected to the interface (due to XDP redirect).
+       These packets are not counted by other software counters. These packets
+       are counted by physical port and vPort counters.
+     - Informative
+
+   * - `tx[i]_xdp_full`
+     - The number of packets redirected to the interface (due to XDP redirect)
+       but dropped due to a full tx queue. These packets are not counted by
+       other software counters. You may enlarge tx queues.
+     - Informative
+
+   * - `tx[i]_xdp_mpwqe`
+     - Number of multi-packet WQEs offloaded onto the NIC that were
+       `XDP_REDIRECT`-ed from other netdevs.
+     - Acceleration
+
+   * - `tx[i]_xdp_inlnw`
+     - Number of WQE data segments where the data could be inlined in the WQE
+       where the data segments were `XDP_REDIRECT`-ed from other netdevs.
+     - Acceleration
+
+   * - `tx[i]_xdp_nops`
+     - Number of NOP WQEBBs (WQE building blocks) posted to the SQ that were
+       `XDP_REDIRECT`-ed from other netdevs.
+     - Acceleration
+
+   * - `tx[i]_xdp_err`
+     - The number of packets redirected to the interface (due to XDP redirect)
+       but dropped due to an error such as frame too long or frame too short.
+     - Error
+
+   * - `tx[i]_xdp_cqes`
+     - The number of completions received for packets redirected to the
+       interface (due to XDP redirect) on the CQ.
+     - Informative
+
+   * - `tx[i]_xsk_xmit`
+     - The number of packets transmitted using XSK zerocopy functionality.
+     - Acceleration
+
+   * - `tx[i]_xsk_mpwqe`
+     - Number of multi-packet WQEs offloaded onto the NIC that were
+       `XDP_REDIRECT`-ed from other netdevs.
+     - Acceleration
+
+   * - `tx[i]_xsk_inlnw`
+     - Number of WQE data segments where the data could be inlined in the WQE
+       that are transmitted using XSK zerocopy.
+     - Acceleration
+
+   * - `tx[i]_xsk_full`
+     - Number of times doorbell is rung in XSK zerocopy mode when SQ is full.
+     - Error
+
+   * - `tx[i]_xsk_err`
+     - Number of errors that occurred in XSK zerocopy mode such as if the data
+       size is larger than the MTU size.
+     - Error
+
+   * - `tx[i]_xsk_cqes`
+     - Number of CQEs processed in XSK zerocopy mode.
+     - Acceleration
+
+   * - `tx_tls_ctx`
+     - Number of TLS TX HW offload contexts added to device for encryption.
+     - Acceleration
+
+   * - `tx_tls_del`
+     - Number of TLS TX HW offload contexts removed from device (connection
+       closed).
+     - Acceleration
+
+   * - `tx_tls_pool_alloc`
+     - Number of times a unit of work is successfully allocated in the TLS HW
+       offload pool.
+     - Acceleration
+
+   * - `tx_tls_pool_free`
+     - Number of times a unit of work is freed in the TLS HW offload pool.
+     - Acceleration
+
+   * - `rx_tls_ctx`
+     - Number of TLS RX HW offload contexts added to device for decryption.
+     - Acceleration
+
+   * - `rx_tls_del`
+     - Number of TLS RX HW offload contexts deleted from device (connection has
+       finished).
+     - Acceleration
+
+   * - `rx[i]_tls_decrypted_packets`
+     - Number of successfully decrypted RX packets which were part of a TLS
+       stream.
+     - Acceleration
+
+   * - `rx[i]_tls_decrypted_bytes`
+     - Number of TLS payload bytes in RX packets which were successfully
+       decrypted.
+     - Acceleration
+
+   * - `rx[i]_tls_resync_req_pkt`
+     - Number of received TLS packets with a resync request.
+     - Acceleration
+
+   * - `rx[i]_tls_resync_req_start`
+     - Number of times the TLS async resync request was started.
+     - Acceleration
+
+   * - `rx[i]_tls_resync_req_end`
+     - Number of times the TLS async resync request properly ended with
+       providing the HW tracked tcp-seq.
+     - Acceleration
+
+   * - `rx[i]_tls_resync_req_skip`
+     - Number of times the TLS async resync request procedure was started but
+       not properly ended.
+     - Error
+
+   * - `rx[i]_tls_resync_res_ok`
+     - Number of times the TLS resync response call to the driver was
+       successfully handled.
+     - Acceleration
+
+   * - `rx[i]_tls_resync_res_retry`
+     - Number of times the TLS resync response call to the driver was
+       reattempted when ICOSQ is full.
+     - Error
+
+   * - `rx[i]_tls_resync_res_skip`
+     - Number of times the TLS resync response call to the driver was terminated
+       unsuccessfully.
+     - Error
+
+   * - `rx[i]_tls_err`
+     - Number of times when CQE TLS offload was problematic.
+     - Error
+
+   * - `tx[i]_tls_encrypted_packets`
+     - The number of send packets that are TLS encrypted by the kernel.
+     - Acceleration
+
+   * - `tx[i]_tls_encrypted_bytes`
+     - The number of send bytes that are TLS encrypted by the kernel.
+     - Acceleration
+
+   * - `tx[i]_tls_ooo`
+     - Number of times out of order TLS SQE fragments were handled on ring i.
+     - Acceleration
+
+   * - `tx[i]_tls_dump_packets`
+     - Number of TLS decrypted packets copied over from NIC over DMA.
+     - Acceleration
+
+   * - `tx[i]_tls_dump_bytes`
+     - Number of TLS decrypted bytes copied over from NIC over DMA.
+     - Acceleration
+
+   * - `tx[i]_tls_resync_bytes`
+     - Number of TLS bytes requested to be resynchronized in order to be
+       decrypted.
+     - Acceleration
+
+   * - `tx[i]_tls_skip_no_sync_data`
+     - Number of TLS send data that can safely be skipped / do not need to be
+       decrypted.
+     - Acceleration
+
+   * - `tx[i]_tls_drop_no_sync_data`
+     - Number of TLS send data that were dropped due to retransmission of TLS
+       data.
+     - Acceleration
+
+   * - `ptp_cq[i]_abort`
+     - Number of times a CQE has to be skipped in precision time protocol due to
+       a skew between the port timestamp and CQE timestamp being greater than
+       128 seconds.
+     - Error
+
+   * - `ptp_cq[i]_abort_abs_diff_ns`
+     - Accumulation of time differences between the port timestamp and CQE
+       timestamp when the difference is greater than 128 seconds in precision
+       time protocol.
+     - Error
+
+.. [#ring_global] The corresponding ring and global counters do not share the
+                  same name (i.e. do not follow the common naming scheme).
+
+vPort Counters
+--------------
+Counters on the NIC port that is connected to an eSwitch.
+
+.. flat-table:: vPort Counter Table
+   :widths: 2 3 1
+
+   * - Counter
+     - Description
+     - Type
+
+   * - `rx_vport_unicast_packets`
+     - Unicast packets received, steered to a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `rx_vport_unicast_bytes`
+     - Unicast bytes received, steered to a port including Raw Ethernet QP/DPDK
+       traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `tx_vport_unicast_packets`
+     - Unicast packets transmitted, steered from a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `tx_vport_unicast_bytes`
+     - Unicast bytes transmitted, steered from a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `rx_vport_multicast_packets`
+     - Multicast packets received, steered to a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `rx_vport_multicast_bytes`
+     - Multicast bytes received, steered to a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `tx_vport_multicast_packets`
+     - Multicast packets transmitted, steered from a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `tx_vport_multicast_bytes`
+     - Multicast bytes transmitted, steered from a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `rx_vport_broadcast_packets`
+     - Broadcast packets received, steered to a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `rx_vport_broadcast_bytes`
+     - Broadcast bytes received, steered to a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `tx_vport_broadcast_packets`
+     - Broadcast packets transmitted, steered from a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `tx_vport_broadcast_bytes`
+     - Broadcast bytes transmitted, steered from a port including Raw Ethernet
+       QP/DPDK traffic, excluding RDMA traffic.
+     - Informative
+
+   * - `rx_vport_rdma_unicast_packets`
+     - RDMA unicast packets received, steered to a port (counters counts
+       RoCE/UD/RC traffic) [#accel]_.
+     - Acceleration
+
+   * - `rx_vport_rdma_unicast_bytes`
+     - RDMA unicast bytes received, steered to a port (counters counts
+       RoCE/UD/RC traffic) [#accel]_.
+     - Acceleration
+
+   * - `tx_vport_rdma_unicast_packets`
+     - RDMA unicast packets transmitted, steered from a port (counters counts
+       RoCE/UD/RC traffic) [#accel]_.
+     - Acceleration
+
+   * - `tx_vport_rdma_unicast_bytes`
+     - RDMA unicast bytes transmitted, steered from a port (counters counts
+       RoCE/UD/RC traffic) [#accel]_.
+     - Acceleration
+
+   * - `rx_vport_rdma_multicast_packets`
+     - RDMA multicast packets received, steered to a port (counters counts
+       RoCE/UD/RC traffic) [#accel]_.
+     - Acceleration
+
+   * - `rx_vport_rdma_multicast_bytes`
+     - RDMA multicast bytes received, steered to a port (counters counts
+       RoCE/UD/RC traffic) [#accel]_.
+     - Acceleration
+
+   * - `tx_vport_rdma_multicast_packets`
+     - RDMA multicast packets transmitted, steered from a port (counter counts
+       RoCE/UD/RC traffic) [#accel]_.
+     - Acceleration
+
+   * - `tx_vport_rdma_multicast_bytes`
+     - RDMA multicast bytes transmitted, steered from a port (counter counts
+       RoCE/UD/RC traffic) [#accel]_.
+     - Acceleration
+
+   * - `rx_steer_missed_packets`
+     - Number of packets that were received by the NIC but discarded because
+       they did not match any flow in the NIC flow table.
+     - Error
+
+   * - `rx_packets`
+     - Representor only: packets received that were handled by the hypervisor.
+     - Informative
+
+   * - `rx_bytes`
+     - Representor only: bytes received that were handled by the hypervisor.
+     - Informative
+
+   * - `tx_packets`
+     - Representor only: packets transmitted that were handled by the
+       hypervisor.
+     - Informative
+
+   * - `tx_bytes`
+     - Representor only: bytes transmitted that were handled by the hypervisor.
+     - Informative
+
+   * - `dev_internal_queue_oob`
+     - The number of packets dropped due to a lack of receive WQEs for an
+       internal device RQ.
+     - Error
+
+Physical Port Counters
+----------------------
+The physical port counters are the counters on the external port connecting the
+adapter to the network. This measuring point holds information on standardized
+counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and additional counters
+like flow control, FEC and more.
+
+.. flat-table:: Physical Port Counter Table
+   :widths: 2 3 1
+
+   * - Counter
+     - Description
+     - Type
+
+   * - `rx_packets_phy`
+     - The number of packets received on the physical port. This counter doesn’t
+       include packets that were discarded due to FCS, frame size and similar
+       errors.
+     - Informative
+
+   * - `tx_packets_phy`
+     - The number of packets transmitted on the physical port.
+     - Informative
+
+   * - `rx_bytes_phy`
+     - The number of bytes received on the physical port, including Ethernet
+       header and FCS.
+     - Informative
+
+   * - `tx_bytes_phy`
+     - The number of bytes transmitted on the physical port.
+     - Informative
+
+   * - `rx_multicast_phy`
+     - The number of multicast packets received on the physical port.
+     - Informative
+
+   * - `tx_multicast_phy`
+     - The number of multicast packets transmitted on the physical port.
+     - Informative
+
+   * - `rx_broadcast_phy`
+     - The number of broadcast packets received on the physical port.
+     - Informative
+
+   * - `tx_broadcast_phy`
+     - The number of broadcast packets transmitted on the physical port.
+     - Informative
+
+   * - `rx_crc_errors_phy`
+     - The number of received packets dropped due to an FCS (Frame Check
+       Sequence) error on the physical port. If this counter increases at a
+       high rate, check the link quality using the `rx_symbol_err_phy` and
+       `rx_corrected_bits_phy` counters below.
+     - Error
+
+   * - `rx_in_range_len_errors_phy`
+     - The number of received packets dropped due to length/type errors on a
+       physical port.
+     - Error
+
+   * - `rx_out_of_range_len_phy`
+     - The number of received packets dropped due to length greater than allowed
+       on a physical port. If this counter is increasing, it implies that the
+       peer connected to the adapter has a larger MTU configured. Using the same
+       MTU configuration on both sides should resolve this issue.
+     - Error
+
+   * - `rx_oversize_pkts_phy`
+     - The number of received packets dropped due to a length that exceeds the
+       MTU size on a physical port. If this counter is increasing, it implies
+       that the peer connected to the adapter has a larger MTU configured. Using
+       the same MTU configuration on both sides should resolve this issue.
+     - Error
+
+   * - `rx_symbol_err_phy`
+     - The number of received packets dropped due to physical coding errors
+       (symbol errors) on a physical port.
+     - Error
+
+   * - `rx_mac_control_phy`
+     - The number of MAC control packets received on the physical port.
+     - Informative
+
+   * - `tx_mac_control_phy`
+     - The number of MAC control packets transmitted on the physical port.
+     - Informative
+
+   * - `rx_pause_ctrl_phy`
+     - The number of link layer pause packets received on a physical port. If
+       this counter is increasing, it implies that the network is congested and
+       cannot absorb the traffic coming from the adapter.
+     - Informative
+
+   * - `tx_pause_ctrl_phy`
+     - The number of link layer pause packets transmitted on a physical port. If
+       this counter is increasing, it implies that the NIC is congested and
+       cannot absorb the traffic coming from the network.
+     - Informative
+
+   * - `rx_unsupported_op_phy`
+     - The number of MAC control packets received with unsupported opcode on a
+       physical port.
+     - Error
+
+   * - `rx_discards_phy`
+     - The number of received packets dropped due to lack of buffers on a
+       physical port. If this counter is increasing, it implies that the adapter
+       is congested and cannot absorb the traffic coming from the network.
+     - Error
+
+   * - `tx_discards_phy`
+     - The number of packets which were discarded on transmission, even though
+       no errors were detected. The drop might occur due to the link being down,
+       head-of-line drop, pause from the network, etc.
+     - Error
+
+   * - `tx_errors_phy`
+     - The number of transmitted packets dropped due to a length that exceeds
+       the MTU size on a physical port.
+     - Error
+
+   * - `rx_undersize_pkts_phy`
+     - The number of received packets dropped due to a length which is shorter
+       than 64 bytes on a physical port. If this counter is increasing, it
+       implies that the peer connected to the adapter has a non-standard MTU
+       configured or that a malformed packet has arrived.
+     - Error
+
+   * - `rx_fragments_phy`
+     - The number of received packets dropped due to a length which is shorter
+       than 64 bytes and an FCS error on a physical port. If this counter is
+       increasing, it implies that the peer connected to the adapter has a
+       non-standard MTU configured.
+     - Error
+
+   * - `rx_jabbers_phy`
+     - The number of received packets dropped due to a length which is longer
+       than 64 bytes and an FCS error on a physical port.
+     - Error
+
+   * - `rx_64_bytes_phy`
+     - The number of packets received on the physical port with size of 64 bytes.
+     - Informative
+
+   * - `rx_65_to_127_bytes_phy`
+     - The number of packets received on the physical port with size of 65 to
+       127 bytes.
+     - Informative
+
+   * - `rx_128_to_255_bytes_phy`
+     - The number of packets received on the physical port with size of 128 to
+       255 bytes.
+     - Informative
+
+   * - `rx_256_to_511_bytes_phy`
+     - The number of packets received on the physical port with size of 256 to
+       511 bytes.
+     - Informative
+
+   * - `rx_512_to_1023_bytes_phy`
+     - The number of packets received on the physical port with size of 512 to
+       1023 bytes.
+     - Informative
+
+   * - `rx_1024_to_1518_bytes_phy`
+     - The number of packets received on the physical port with size of 1024 to
+       1518 bytes.
+     - Informative
+
+   * - `rx_1519_to_2047_bytes_phy`
+     - The number of packets received on the physical port with size of 1519 to
+       2047 bytes.
+     - Informative
+
+   * - `rx_2048_to_4095_bytes_phy`
+     - The number of packets received on the physical port with size of 2048 to
+       4095 bytes.
+     - Informative
+
+   * - `rx_4096_to_8191_bytes_phy`
+     - The number of packets received on the physical port with size of 4096 to
+       8191 bytes.
+     - Informative
+
+   * - `rx_8192_to_10239_bytes_phy`
+     - The number of packets received on the physical port with size of 8192 to
+       10239 bytes.
+     - Informative
+
+   * - `link_down_events_phy`
+     - The number of times the link operational state changed to down. If this
+       counter is increasing, it may indicate port flapping. You may need to
+       replace the cable/transceiver.
+     - Error
+
+   * - `rx_out_of_buffer`
+     - The number of times the receive queue had no software buffers allocated
+       for the adapter's incoming traffic.
+     - Error
+
+   * - `module_bus_stuck`
+     - The number of times a short-wire on the module's I\ :sup:`2`\C bus (data
+       or clock) was detected. You may need to replace the cable/transceiver.
+     - Error
+
+   * - `module_high_temp`
+     - The number of times that the module temperature was too high. If this
+       issue persists, you may need to check the ambient temperature or replace
+       the cable/transceiver module.
+     - Error
+
+   * - `module_bad_shorted`
+     - The number of times that the module cables were shorted. You may need to
+       replace the cable/transceiver module.
+     - Error
+
+   * - `module_unplug`
+     - The number of times that the module was ejected.
+     - Informative
+
+   * - `rx_buffer_passed_thres_phy`
+     - The number of events where the port receive buffer was over 85% full.
+     - Informative
+
+   * - `tx_pause_storm_warning_events`
+     - The number of times the device was sending pauses for a long period of
+       time.
+     - Informative
+
+   * - `tx_pause_storm_error_events`
+     - The number of times the device was sending pauses for a long period of
+       time, reaching a timeout and disabling transmission of pause frames.
+       During the period when pause frames were disabled, drops could have
+       occurred.
+     - Error
+
+   * - `rx[i]_buff_alloc_err`
+     - The number of failures to allocate a buffer for a received packet (or
+       SKB) on ring i.
+     - Error
+
+   * - `rx_bits_phy`
+     - This counter provides information on the total amount of traffic that
+       could have been received and can be used as a guideline to measure the
+       ratio of errored traffic in `rx_pcs_symbol_err_phy` and
+       `rx_corrected_bits_phy`.
+     - Informative
+
+   * - `rx_pcs_symbol_err_phy`
+     - This counter counts the number of symbol errors that were not corrected
+       by the FEC correction algorithm, or that occurred while the FEC algorithm
+       was not active on this interface. If this counter is increasing, it
+       implies that the link between the NIC and the network is suffering from
+       high BER, and that traffic is lost. You may need to replace the
+       cable/transceiver. The error rate is the number of
+       `rx_pcs_symbol_err_phy` divided by the number of `rx_bits_phy` over a
+       specific time frame.
+     - Error
+
+   * - `rx_corrected_bits_phy`
+     - The number of corrected bits on this port according to active FEC
+       (RS/FC). If this counter is increasing, it implies that the link between
+       the NIC and the network is suffering from high BER. The corrected bit
+       rate is the number of `rx_corrected_bits_phy` divided by the number of
+       `rx_bits_phy` over a specific time frame.
+     - Error
+
+   * - `rx_err_lane_[l]_phy`
+     - This counter counts the number of physical raw errors for lane index l.
+       The counter counts errors before FEC corrections. If this counter is
+       increasing, it implies that the link between the NIC and the network is
+       suffering from high BER, and that traffic might be lost. You may need to
+       replace the cable/transceiver. Check this counter together with
+       `rx_corrected_bits_phy`.
+     - Error
+
+   * - `rx_global_pause`
+     - The number of pause packets received on the physical port. If this
+       counter is increasing, it implies that the network is congested and
+       cannot absorb the traffic coming from the adapter. Note: This counter is
+       only enabled when global pause mode is enabled.
+     - Informative
+
+   * - `rx_global_pause_duration`
+     - The duration of pause received (in microseconds) on the physical port.
+       The counter represents the time the port did not send any traffic. If
+       this counter is increasing, it implies that the network is congested and
+       cannot absorb the traffic coming from the adapter. Note: This counter is
+       only enabled when global pause mode is enabled.
+     - Informative
+
+   * - `tx_global_pause`
+     - The number of pause packets transmitted on a physical port. If this
+       counter is increasing, it implies that the adapter is congested and
+       cannot absorb the traffic coming from the network. Note: This counter is
+       only enabled when global pause mode is enabled.
+     - Informative
+
+   * - `tx_global_pause_duration`
+     - The duration of pause transmitted (in microseconds) on the physical
+       port. Note: This counter is only enabled when global pause mode is
+       enabled.
+     - Informative
+
+   * - `rx_global_pause_transition`
+     - The number of times a transition from Xoff to Xon on the physical port
+       has occurred. Note: This counter is only enabled when global pause mode
+       is enabled.
+     - Informative
+
+   * - `rx_if_down_packets`
+     - The number of received packets that were dropped due to interface down.
+     - Informative
+
+Priority Port Counters
+----------------------
+The following counters are physical port counters that are counted per L2
+priority (0-7).
+
+**Note:** `p` in the counter name represents the priority.
+
+.. flat-table:: Priority Port Counter Table
+   :widths: 2 3 1
+
+   * - Counter
+     - Description
+     - Type
+
+   * - `rx_prio[p]_bytes`
+     - The number of bytes received with priority p on the physical port.
+     - Informative
+
+   * - `rx_prio[p]_packets`
+     - The number of packets received with priority p on the physical port.
+     - Informative
+
+   * - `tx_prio[p]_bytes`
+     - The number of bytes transmitted on priority p on the physical port.
+     - Informative
+
+   * - `tx_prio[p]_packets`
+     - The number of packets transmitted on priority p on the physical port.
+     - Informative
+
+   * - `rx_prio[p]_pause`
+     - The number of pause packets received with priority p on a physical port.
+       If this counter is increasing, it implies that the network is congested
+       and cannot absorb the traffic coming from the adapter. Note: This counter
+       is available only if PFC was enabled on priority p.
+     - Informative
+
+   * - `rx_prio[p]_pause_duration`
+     - The duration of pause received (in microseconds) on priority p on the
+       physical port. The counter represents the time the port did not send any
+       traffic on this priority. If this counter is increasing, it implies that
+       the network is congested and cannot absorb the traffic coming from the
+       adapter. Note: This counter is available only if PFC was enabled on
+       priority p.
+     - Informative
+
+   * - `rx_prio[p]_pause_transition`
+     - The number of times a transition from Xoff to Xon on priority p on the
+       physical port has occurred. Note: This counter is available only if PFC
+       was enabled on priority p.
+     - Informative
+
+   * - `tx_prio[p]_pause`
+     - The number of pause packets transmitted on priority p on a physical port.
+       If this counter is increasing, it implies that the adapter is congested
+       and cannot absorb the traffic coming from the network. Note: This counter
+       is available only if PFC was enabled on priority p.
+     - Informative
+
+   * - `tx_prio[p]_pause_duration`
+     - The duration of pause transmitted (in microseconds) on priority p on the
+       physical port. Note: This counter is available only if PFC was enabled on
+       priority p.
+     - Informative
+
+   * - `rx_prio[p]_buf_discard`
+     - The number of packets discarded by the device due to a lack of per-host
+       receive buffers.
+     - Informative
+
+   * - `rx_prio[p]_cong_discard`
+     - The number of packets discarded by the device due to per-host
+       congestion.
+     - Informative
+
+   * - `rx_prio[p]_marked`
+     - The number of packets ECN-marked by the device due to per-host
+       congestion.
+     - Informative
+
+   * - `rx_prio[p]_discards`
+     - The number of packets discarded by the device due to a lack of receive
+       buffers.
+     - Informative
+
+Device Counters
+---------------
+.. flat-table:: Device Counter Table
+   :widths: 2 3 1
+
+   * - Counter
+     - Description
+     - Type
+
+   * - `rx_pci_signal_integrity`
+     - Counts physical layer PCIe signal integrity errors, the number of
+       transitions to recovery due to Framing errors and CRC (dlp and tlp). If
+       this counter is rising, try moving the adapter card to a different slot
+       to rule out a bad PCI slot. Validate that you are running with the latest
+       firmware available and latest server BIOS version.
+     - Error
+
+   * - `tx_pci_signal_integrity`
+     - Counts physical layer PCIe signal integrity errors, the number of
+       transitions to recovery initiated by the other side (moving to recovery
+       due to getting TS/EIEOS). If this counter is rising, try moving the
+       adapter card to a different slot to rule out a bad PCI slot. Validate
+       that you are running with the latest firmware available and latest server
+       BIOS version.
+     - Error
+
+   * - `outbound_pci_buffer_overflow`
+     - The number of packets dropped due to PCI buffer overflow. If this counter
+       is rising at a high rate, it might indicate that the receive traffic
+       rate for a host exceeds what the PCIe bus can carry, causing congestion.
+     - Informative
+
+   * - `outbound_pci_stalled_rd`
+     - The percentage (in the range 0...100) of time within the last second that
+       the NIC had outbound non-posted read requests but could not perform the
+       operation due to insufficient posted credits.
+     - Informative
+
+   * - `outbound_pci_stalled_wr`
+     - The percentage (in the range 0...100) of time within the last second that
+       the NIC had outbound posted write requests but could not perform the
+       operation due to insufficient posted credits.
+     - Informative
+
+   * - `outbound_pci_stalled_rd_events`
+     - The number of seconds where `outbound_pci_stalled_rd` was above 30%.
+     - Informative
+
+   * - `outbound_pci_stalled_wr_events`
+     - The number of seconds where `outbound_pci_stalled_wr` was above 30%.
+     - Informative
+
+   * - `dev_out_of_buffer`
+     - The number of times the device-owned queue did not have enough buffers
+       allocated.
+     - Error
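
The ratio described for `rx_pcs_symbol_err_phy` and `rx_corrected_bits_phy` in the Physical Port Counter Table (error delta divided by the `rx_bits_phy` delta over the same time frame) could be computed roughly as sketched below. This is only an illustration, not part of the patch: the helper names `parse_counters` and `error_ratio` are hypothetical, the sample values are made up, and the input is assumed to be the `name: value` lines that `ethtool -S <ifname>` prints.

```python
def parse_counters(text):
    """Parse 'name: value' lines (as printed by `ethtool -S`) into a dict."""
    counters = {}
    for line in text.splitlines():
        # Split on the last colon; header lines such as "NIC statistics:"
        # have no numeric value after it and are skipped.
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def error_ratio(before, after, error_counter, total_counter="rx_bits_phy"):
    """Return (error delta) / (total delta) between two snapshots.

    Returns None when the total counter did not advance, since the ratio
    is undefined in that case.
    """
    errors = after[error_counter] - before[error_counter]
    total = after[total_counter] - before[total_counter]
    if total <= 0:
        return None
    return errors / total


# Made-up snapshots taken at the start and end of a measurement interval.
SNAPSHOT_T0 = """NIC statistics:
     rx_bits_phy: 1000000
     rx_pcs_symbol_err_phy: 2
     rx_corrected_bits_phy: 10
"""
SNAPSHOT_T1 = """NIC statistics:
     rx_bits_phy: 3000000
     rx_pcs_symbol_err_phy: 4
     rx_corrected_bits_phy: 110
"""

t0 = parse_counters(SNAPSHOT_T0)
t1 = parse_counters(SNAPSHOT_T1)
symbol_error_rate = error_ratio(t0, t1, "rx_pcs_symbol_err_phy")
corrected_bit_rate = error_ratio(t0, t1, "rx_corrected_bits_phy")
print(f"symbol error rate:  {symbol_error_rate:.2e}")
print(f"corrected bit rate: {corrected_bit_rate:.2e}")
```

Working on deltas between two snapshots, rather than on raw totals, keeps the ratio tied to a specific time frame as the counter descriptions require.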
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst
index 2346459ae6cc..3fdcd6b61ccf 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst
@@ -16,6 +16,7 @@  Contents:
    devlink
    switchdev
    tracepoints
+   counters
 
 .. only::  subproject and html