[rdma-next,v2,00/13] Optional counter statistics support

Message ID cover.1632988543.git.leonro@nvidia.com (mailing list archive)

Message

Leon Romanovsky Sept. 30, 2021, 8:02 a.m. UTC
From: Leon Romanovsky <leonro@nvidia.com>

Change Log:
v2:
 * Add rdma_free_hw_stats_struct() helper API (with a new patch)
 * In sysfs add a WARN_ON to check if optional stats are always at the end
 * Add a new nldev command to get the counter status
 * Improve nldev_stat_set_counter_dynamic_doit() by creating a target state bitmap
v1: https://lore.kernel.org/all/cover.1631660727.git.leonro@nvidia.com
* Add a descriptor structure to replace name in struct rdma_hw_stats;
* Add a bitmap in struct rdma_hw_stats to indicate the enable/disable
  status of all counters;
* Add a "flag" field in counter descriptor and define
  IB_STAT_FLAG_OPTIONAL flag;
* add/remove_op_stat() are replaced by modify_op_stat();
* Use "set/unset" in command line and send full opcounters list through
  netlink, and send opcounter indexes instead of names;
* Patches are re-ordered.
v0: https://lore.kernel.org/all/20210818112428.209111-1-markzhang@nvidia.com
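
The v2 note about the sysfs WARN_ON relies on one ordering rule: no
mandatory counter may follow an optional one in the descriptor array.
A minimal userspace sketch of that check follows. The names
STAT_FLAG_OPTIONAL, struct stat_desc and optional_stats_at_end() are
hypothetical stand-ins chosen for illustration, not the kernel
definitions from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "optional" descriptor flag. */
#define STAT_FLAG_OPTIONAL (1u << 0)

/* Hypothetical stand-in for a per-counter descriptor. */
struct stat_desc {
	const char *name;
	uint32_t flags;
};

/*
 * Return true when every optional descriptor sits after all mandatory
 * ones, i.e. the optional counters form a suffix of the array.
 */
static bool optional_stats_at_end(const struct stat_desc *descs, size_t n)
{
	bool seen_optional = false;
	size_t i;

	assert(descs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		bool optional = (descs[i].flags & STAT_FLAG_OPTIONAL) != 0;

		/* A mandatory counter after an optional one breaks the rule. */
		if (seen_optional && !optional)
			return false;
		if (optional)
			seen_optional = true;
	}
	return true;
}
```

In kernel code a failed check of this kind would presumably be wrapped
in WARN_ON() rather than returned to a caller.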

----------------------------------------------------------------------
Hi,

This series from Neta and Aharon extends the rdma statistics tool so
that optional counters can be set dynamically, using netlink.

The idea behind optional counters is to let users collect statistics
from counters that hurt performance when they are enabled.

Once an optional counter is enabled, its statistics are presented along
with all the other counters by the show command.
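
As a rough illustration of that behavior, the sketch below walks a
counter array and prints only the entries whose "disabled" bit is
clear. It is a userspace approximation under stated assumptions:
struct hw_stats_sketch, counter_disabled() and show_counters() are
hypothetical names, and the "name value " output format only mimics the
show output quoted in the examples.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the per-port statistics structure. */
struct hw_stats_sketch {
	const char *const *names;
	const uint64_t *values;
	const unsigned long *is_disabled; /* bit i set: counter i is hidden */
	size_t num_counters;
};

static bool counter_disabled(const struct hw_stats_sketch *s, size_t i)
{
	return (s->is_disabled[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
}

/*
 * Write "name value " for every enabled counter into buf and return
 * how many counters were written. Stops early if buf is too small.
 */
static size_t show_counters(const struct hw_stats_sketch *s, char *buf,
			    size_t len)
{
	size_t used = 0, shown = 0, i;

	assert(s != NULL && buf != NULL);

	if (len)
		buf[0] = '\0';
	for (i = 0; i < s->num_counters; i++) {
		int n;

		if (counter_disabled(s, i))
			continue;
		n = snprintf(buf + used, len - used, "%s %llu ", s->names[i],
			     (unsigned long long)s->values[i]);
		if (n < 0 || (size_t)n >= len - used)
			break; /* entry did not fit */
		used += (size_t)n;
		shown++;
	}
	if (used < len)
		buf[used] = '\0'; /* drop any partially written entry */
	return shown;
}
```

Mandatory counters never have their bit set in this model, so they are
always printed; an optional counter shows up only after it is enabled.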

Binding objects to the optional counters is currently not supported in
either auto mode or manual mode.

To get the list of optional counters that are supported on this device,
use "rdma statistic mode supported". To see which counters are currently
enabled, use "rdma statistic mode".

Examples:

$ rdma statistic mode supported
link rocep8s0f0/1 supported optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts,cc_tx_cnp_pkts
link rocep8s0f1/1 supported optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts,cc_tx_cnp_pkts

$ sudo rdma statistic set link rocep8s0f0/1 optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts
$ rdma statistic mode link rocep8s0f0/1
link rocep8s0f0/1 optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts

$ rdma statistic show link rocep8s0f0/1
link rocep8s0f0/1 rx_write_requests 0 rx_read_requests 0 rx_atomic_requests 0 out_of_buffer 0
out_of_sequence 0 duplicate_request 0 rnr_nak_retry_err 0 packet_seq_err 0 implied_nak_seq_err 0
local_ack_timeout_err 0 resp_local_length_error 0 resp_cqe_error 0 req_cqe_error 0
req_remote_invalid_request 0 req_remote_access_errors 0 resp_remote_access_errors 0
resp_cqe_flush_error 0 req_cqe_flush_error 0 roce_adp_retrans 0 roce_adp_retrans_to 0
roce_slow_restart 0 roce_slow_restart_cnps 0 roce_slow_restart_trans 0 rp_cnp_ignored 0
rp_cnp_handled 0 np_ecn_marked_roce_packets 0 np_cnp_sent 0 rx_icrc_encapsulated 0 cc_rx_ce_pkts 0
cc_rx_cnp_pkts 0

$ sudo rdma statistic set link rocep8s0f0/1 optional-counters cc_rx_ce_pkts
$ rdma statistic mode link rocep8s0f0/1
link rocep8s0f0/1 optional-counters cc_rx_ce_pkts
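
The two "set" invocations above suggest that each request carries the
full list of optional counters that should be enabled, and that anything
left out is disabled. A minimal userspace sketch of that "target state
bitmap" approach follows. All names here (struct hw_stats_sketch,
set_optional_counters(), STAT_FLAG_OPTIONAL) are hypothetical, calloc()
stands in for kcalloc(), and the final bit flip stands in for whatever
driver callback actually toggles a counter.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define STAT_FLAG_OPTIONAL (1u << 0) /* hypothetical flag value */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(n) (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct stat_desc {
	const char *name;
	uint32_t flags;
};

struct hw_stats_sketch {
	const struct stat_desc *descs;
	unsigned long *is_disabled; /* bit i set: counter i is disabled */
	size_t num_counters;
};

static bool bit_is_set(const unsigned long *map, size_t i)
{
	return (map[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
}

static void bit_assign(unsigned long *map, size_t i, bool value)
{
	unsigned long mask = 1UL << (i % BITS_PER_WORD);

	if (value)
		map[i / BITS_PER_WORD] |= mask;
	else
		map[i / BITS_PER_WORD] &= ~mask;
}

/*
 * Apply a full list of requested optional-counter indexes.
 * Returns 0 on success and -1 on a bad index, a non-optional counter
 * or an allocation failure; the state is left untouched on error.
 */
static int set_optional_counters(struct hw_stats_sketch *s,
				 const uint32_t *req, size_t nreq)
{
	unsigned long *target;
	size_t i;

	assert(s != NULL);
	if (s->num_counters == 0)
		return nreq ? -1 : 0;

	target = calloc(WORDS_FOR(s->num_counters), sizeof(*target));
	if (!target)
		return -1;

	/* Pass 1: validate every index and record the wanted state. */
	for (i = 0; i < nreq; i++) {
		if (req[i] >= s->num_counters ||
		    !(s->descs[req[i]].flags & STAT_FLAG_OPTIONAL)) {
			free(target);
			return -1;
		}
		bit_assign(target, req[i], true);
	}

	/* Pass 2: flip only the optional counters whose state differs. */
	for (i = 0; i < s->num_counters; i++) {
		bool want, enabled;

		if (!(s->descs[i].flags & STAT_FLAG_OPTIONAL))
			continue;
		want = bit_is_set(target, i);
		enabled = !bit_is_set(s->is_disabled, i);
		if (want != enabled)
			bit_assign(s->is_disabled, i, !want);
	}

	free(target);
	return 0;
}
```

Validating the whole request before touching any state keeps a bad
index from leaving the counters half-configured.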

Thanks

Aharon Landau (12):
  net/mlx5: Add ifc bits to support optional counters
  net/mlx5: Add priorities for counters in RDMA namespaces
  RDMA/counter: Add a descriptor in struct rdma_hw_stats
  RDMA/counter: Add an is_disabled field in struct rdma_hw_stats
  RDMA/counter: Add optional counter support
  RDMA/nldev: Add support to get status of all counters
  RDMA/nldev: Allow optional-counter status configuration through RDMA
    netlink
  RDMA/mlx5: Support optional counters in hw_stats initialization
  RDMA/mlx5: Add steering support in optional flow counters
  RDMA/mlx5: Add modify_op_stat() support
  RDMA/mlx5: Add optional counter support in get_hw_stats callback
  RDMA/nldev: Add support to get status of all counters

Mark Zhang (1):
  RDMA/core: Add a helper API rdma_free_hw_stats_struct

 drivers/infiniband/core/counters.c            |  38 +-
 drivers/infiniband/core/device.c              |   1 +
 drivers/infiniband/core/nldev.c               | 388 ++++++++++++++----
 drivers/infiniband/core/sysfs.c               |  52 ++-
 drivers/infiniband/core/verbs.c               |  36 ++
 drivers/infiniband/hw/bnxt_re/hw_counters.c   | 137 +++----
 drivers/infiniband/hw/cxgb4/provider.c        |  22 +-
 drivers/infiniband/hw/efa/efa_verbs.c         |  19 +-
 drivers/infiniband/hw/hfi1/verbs.c            |  47 ++-
 drivers/infiniband/hw/irdma/verbs.c           |  98 ++---
 drivers/infiniband/hw/mlx4/main.c             |  37 +-
 drivers/infiniband/hw/mlx4/mlx4_ib.h          |   2 +-
 drivers/infiniband/hw/mlx5/counters.c         | 280 +++++++++++--
 drivers/infiniband/hw/mlx5/fs.c               | 187 +++++++++
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |  28 +-
 drivers/infiniband/sw/rxe/rxe_hw_counters.c   |  42 +-
 .../net/ethernet/mellanox/mlx5/core/fs_core.c |  54 ++-
 include/linux/mlx5/device.h                   |   2 +
 include/linux/mlx5/fs.h                       |   2 +
 include/linux/mlx5/mlx5_ifc.h                 |  22 +-
 include/rdma/ib_hdrs.h                        |   1 +
 include/rdma/ib_verbs.h                       |  57 ++-
 include/rdma/rdma_counter.h                   |   2 +
 include/uapi/rdma/rdma_netlink.h              |   5 +
 24 files changed, 1199 insertions(+), 360 deletions(-)

Comments

Jason Gunthorpe Oct. 4, 2021, 6:11 p.m. UTC | #1
On Thu, Sep 30, 2021 at 11:02:16AM +0300, Leon Romanovsky wrote:

> v2:
>  * Add rdma_free_hw_stats_struct() helper API (with a new patch)
>  * In sysfs add a WARN_ON to check if optional stats are always at the end
>  * Add a new nldev command to get the counter status
>  * Improve nldev_stat_set_counter_dynamic_doit() by creating a
> target state bitmap

Other than still having different get behavior it looks mostly Ok

Please mind some of the wonky formatting; clang-format-diff will give a
clue where to look. I got this list:

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index adbddb07b08ed9..a7f9fe234a9e93 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -1910,13 +1910,13 @@ static int nldev_stat_set_counter_dynamic_doit(struct nlattr *tb[],
 	if (!stats)
 		return -EINVAL;
 
-	target = kcalloc(BITS_TO_LONGS(stats->num_counters),
-		       sizeof(long), GFP_KERNEL);
+	target = kcalloc(BITS_TO_LONGS(stats->num_counters), sizeof(long),
+			 GFP_KERNEL);
 	if (!target)
 		return -ENOMEM;
 
-	nla_for_each_nested(entry_attr,
-			    tb[RDMA_NLDEV_ATTR_STAT_HWCOUNTERS], rem) {
+	nla_for_each_nested (entry_attr, tb[RDMA_NLDEV_ATTR_STAT_HWCOUNTERS],
+			     rem) {
 		index = nla_get_u32(entry_attr);
 		if ((index >= stats->num_counters) ||
 		    !(stats->descs[index].flags & IB_STAT_FLAG_OPTIONAL)) {
@@ -1959,10 +1959,9 @@ static int nldev_stat_set_mode_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (!msg)
 		return -ENOMEM;
 
-	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
-			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
-					 RDMA_NLDEV_CMD_STAT_SET),
-			0, 0);
+	nlh = nlmsg_put(
+		msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
+		RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_STAT_SET), 0, 0);
 
 	mode = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_MODE]);
 	if (mode == RDMA_COUNTER_MODE_AUTO) {
@@ -2052,8 +2051,8 @@ static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 	u32 index, port, qpn, cntn;
 	int ret;
 
-	ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
-			  nldev_policy, extack);
+	ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1, nldev_policy,
+			  extack);
 	if (ret || !tb[RDMA_NLDEV_ATTR_STAT_RES] ||
 	    !tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_PORT_INDEX] ||
 	    !tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID] ||
@@ -2079,10 +2078,9 @@ static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 		ret = -ENOMEM;
 		goto err;
 	}
-	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
-			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
-					 RDMA_NLDEV_CMD_STAT_SET),
-			0, 0);
+	nlh = nlmsg_put(
+		msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
+		RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_STAT_SET), 0, 0);
 
 	cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]);
 	qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]);
@@ -2109,8 +2107,7 @@ static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 	return ret;
 }
 
-static int stat_get_doit_stats_list(struct sk_buff *skb,
-				    struct nlmsghdr *nlh,
+static int stat_get_doit_stats_list(struct sk_buff *skb, struct nlmsghdr *nlh,
 				    struct netlink_ext_ack *extack,
 				    struct nlattr *tb[],
 				    struct ib_device *device, u32 port,

diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index 35d818b38e7780..e5919bd9a25106 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -60,8 +60,7 @@ struct efa_user_mmap_entry {
 	op(EFA_RDMA_READ_RESP_BYTES, "rdma_read_resp_bytes") \
 
 #define EFA_STATS_ENUM(ename, name) ename,
-#define EFA_STATS_STR(ename, nam) \
-	[ename].name = nam,
+#define EFA_STATS_STR(ename, nam) [ename].name = nam,
 
 enum efa_hw_device_stats {
 	EFA_DEFINE_DEVICE_STATS(EFA_STATS_ENUM)
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 09354f1257a947..ed9fa0d84e9ed3 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1614,10 +1614,8 @@ static int cntr_names_initialized;
  * strings. Optionally some entries can be reserved in the array to hold extra
  * external strings.
  */
-static int init_cntr_names(const char *names_in,
-			   const size_t names_len,
-			   int num_extra_names,
-			   int *num_cntrs,
+static int init_cntr_names(const char *names_in, const size_t names_len,
+			   int num_extra_names, int *num_cntrs,
 			   struct rdma_stat_desc **cntr_descs)
 {
 	struct rdma_stat_desc *q;
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index fd4dfb43006b54..38742e233915f9 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2152,9 +2152,7 @@ static int mlx4_ib_get_hw_stats(struct ib_device *ibdev,
 
 static int __mlx4_ib_alloc_diag_counters(struct mlx4_ib_dev *ibdev,
 					 struct rdma_stat_desc **pdescs,
-					 u32 **offset,
-					 u32 *num,
-					 bool port)
+					 u32 **offset, u32 *num, bool port)
 {
 	u32 num_counters;
 
@@ -2186,8 +2184,7 @@ static int __mlx4_ib_alloc_diag_counters(struct mlx4_ib_dev *ibdev,
 
 static void mlx4_ib_fill_diag_counters(struct mlx4_ib_dev *ibdev,
 				       struct rdma_stat_desc *descs,
-				       u32 *offset,
-				       bool port)
+				       u32 *offset, bool port)
 {
 	int i;
 	int j;
diff --git a/drivers/infiniband/hw/mlx5/counters.c b/drivers/infiniband/hw/mlx5/counters.c
index 206c190d0ce4f9..f72dffd2f42c03 100644
--- a/drivers/infiniband/hw/mlx5/counters.c
+++ b/drivers/infiniband/hw/mlx5/counters.c
@@ -478,10 +478,8 @@ static int mlx5_ib_counter_unbind_qp(struct ib_qp *qp)
 	return mlx5_ib_qp_set_counter(qp, NULL);
 }
 
-
 static void mlx5_ib_fill_counters(struct mlx5_ib_dev *dev,
-				  struct rdma_stat_desc *descs,
-				  size_t *offsets)
+				  struct rdma_stat_desc *descs, size_t *offsets)
 {
 	int i;
 	int j = 0;
diff --git a/drivers/infiniband/sw/rxe/rxe_hw_counters.c b/drivers/infiniband/sw/rxe/rxe_hw_counters.c
--- a/drivers/infiniband/sw/rxe/rxe_hw_counters.c
+++ b/drivers/infiniband/sw/rxe/rxe_hw_counters.c
@@ -34,7 +34,7 @@ int rxe_ib_get_hw_stats(struct ib_device *ibdev,
 	if (!port || !stats)
 		return -EINVAL;
 
-	for (cnt = 0; cnt  < ARRAY_SIZE(rxe_counter_descs); cnt++)
+	for (cnt = 0; cnt < ARRAY_SIZE(rxe_counter_descs); cnt++)
 		stats->value[cnt] = atomic64_read(&dev->stats_counters[cnt]);
 
 	return ARRAY_SIZE(rxe_counter_descs);