Message ID | 20210818112428.209111-1-markzhang@nvidia.com (mailing list archive) |
---|---|
Series | Optional counter statistics support |
On Wed, Aug 18, 2021 at 02:24:18PM +0300, Mark Zhang wrote:
> Hi,
>
> This series from Aharon and Neta provides an extension to the rdma
> statistics tool that allows to add and remove optional counters
> dynamically, using new netlink commands.
>
> The idea of having optional counters is to provide to the users the
> ability to get statistics of counters that hurts performance.
>
> Once an optional counter was added, its statistics will be presented
> along with all the counters, using the show command.
>
> Binding objects to the optional counters is currently not supported,
> neither in auto mode nor in manual mode.
>
> To get the list of optional counters that are supported on this device,
> use "rdma statistic mode supported". To see which counters are currently
> enabled, use "rdma statistic mode".
>
> $ rdma statistic mode supported
> link rocep8s0f0/1
> Optional-set: cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts
> link rocep8s0f1/1
> Optional-set: cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts
>
> $ sudo rdma statistic add link rocep8s0f0/1 optional-set cc_rx_ce_pkts
> $ rdma statistic mode
> link rocep8s0f0/1
> Optional-set: cc_rx_ce_pkts
> $ sudo rdma statistic add link rocep8s0f0/1 optional-set cc_tx_cnp_pkts
> $ rdma statistic mode
> link rocep8s0f0/1
> Optional-set: cc_rx_ce_pkts cc_tx_cnp_pkts

This doesn't look like the right output to iproute to me, the two
command should not be using the same tag and the output of iproute
should always be formed to be valid input to iproute

> $ rdma statistic show link rocep8s0f0/1
> link rocep8s0f0/1 rx_write_requests 0 rx_read_requests 0 rx_atomic_requests 0 out_of_buffer 0
> out_of_sequence 0 duplicate_request 0 rnr_nak_retry_err 0 packet_seq_err 0 implied_nak_seq_err 0
> local_ack_timeout_err 0 resp_local_length_error 0 resp_cqe_error 0 req_cqe_error 0
> req_remote_invalid_request 0 req_remote_access_errors 0 resp_remote_access_errors 0
> resp_cqe_flush_error 0 req_cqe_flush_error 0 roce_adp_retrans 0 roce_adp_retrans_to 0
> roce_slow_restart 0 roce_slow_restart_cnps 0 roce_slow_restart_trans 0 rp_cnp_ignored 0
> rp_cnp_handled 0 np_ecn_marked_roce_packets 0 np_cnp_sent 0 rx_icrc_encapsulated 0
> Optional-set: cc_rx_ce_pkts 0 cc_tx_cnp_pkts 0

Also looks bad, optional counters should not be marked specially at
this point.

> Aharon Landau (9):
>   net/mlx5: Add support in bth_opcode as a match criteria
>   net/mlx5: Add priorities for counters in RDMA namespaces
>   RDMA/counters: Support to allocate per-port optional counter
>     statistics
>   RDMA/mlx5: Add alloc_op_port_stats() support
>   RDMA/mlx5: Add steering support in optional flow counters
>   RDMA/nldev: Add support to add and remove optional counters
>   RDMA/mlx5: Add add_op_stat() and remove_op_stat() support
>   RDMA/mlx5: Add get_op_stats() support
>   RDMA/nldev: Add support to get current enabled optional counters
>
> Neta Ostrovsky (1):
>   RDMA/nldev: Add support to get optional counters statistics

This series is in a poor order, all the core update should come first
and the commit messages should explain what is going on when building
out the new APIs.

The RDMA/mlx5 patches can go last

Jason
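
The requirement that output "always be formed to be valid input" can be pictured as a round trip: whatever the tool prints should parse back as command arguments. The following is only a rough sketch of that idea and not the iproute2 implementation; `format_mode_line` and `parse_add_args` are hypothetical helpers, and accepting several counter names in a single `add` is an assumption made for illustration.

```python
def format_mode_line(link, counters, keyword="optional-set"):
    """Render one 'rdma statistic mode' style line as plain keyword/value tokens."""
    return " ".join(["link", link, keyword, *counters])


def parse_add_args(argv, keyword="optional-set"):
    """Parse 'link <dev/port> <keyword> <counter>...' the way an add command might.

    Hypothetical parser: taking several counter names at once is an assumption.
    """
    if len(argv) < 4 or argv[0] != "link" or argv[2] != keyword:
        raise ValueError(f"not valid input: {' '.join(argv)}")
    return argv[1], list(argv[3:])


# Round trip: the printed line splits into tokens the parser accepts.
line = format_mode_line("rocep8s0f0/1", ["cc_rx_ce_pkts", "cc_tx_cnp_pkts"])
link, counters = parse_add_args(line.split())
```

Under these assumptions, the originally posted form, with `Optional-set:` as a capitalized label followed by a colon, would be rejected by the same parser, which is the mismatch the review comment points at.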
On 8/24/2021 3:33 AM, Jason Gunthorpe wrote:
> On Wed, Aug 18, 2021 at 02:24:18PM +0300, Mark Zhang wrote:
>> Hi,
>>
>> This series from Aharon and Neta provides an extension to the rdma
>> statistics tool that allows to add and remove optional counters
>> dynamically, using new netlink commands.
>>
>> The idea of having optional counters is to provide to the users the
>> ability to get statistics of counters that hurts performance.
>>
>> Once an optional counter was added, its statistics will be presented
>> along with all the counters, using the show command.
>>
>> Binding objects to the optional counters is currently not supported,
>> neither in auto mode nor in manual mode.
>>
>> To get the list of optional counters that are supported on this device,
>> use "rdma statistic mode supported". To see which counters are currently
>> enabled, use "rdma statistic mode".
>>
>> $ rdma statistic mode supported
>> link rocep8s0f0/1
>> Optional-set: cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts
>> link rocep8s0f1/1
>> Optional-set: cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts
>>
>> $ sudo rdma statistic add link rocep8s0f0/1 optional-set cc_rx_ce_pkts
>> $ rdma statistic mode
>> link rocep8s0f0/1
>> Optional-set: cc_rx_ce_pkts
>> $ sudo rdma statistic add link rocep8s0f0/1 optional-set cc_tx_cnp_pkts
>> $ rdma statistic mode
>> link rocep8s0f0/1
>> Optional-set: cc_rx_ce_pkts cc_tx_cnp_pkts
>
> This doesn't look like the right output to iproute to me, the two
> command should not be using the same tag and the output of iproute
> should always be formed to be valid input to iproute

So it should be like this:

$ rdma statistic mode supported
link rocep8s0f0/1 optional-set cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts
link rocep8s0f1/1 optional-set cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts

$ sudo rdma statistic add link rocep8s0f0/1 optional-set cc_rx_ce_pkts
$ rdma statistic mode
link rocep8s0f0/1 optional-set cc_rx_ce_pkts
$ sudo rdma statistic add link rocep8s0f0/1 optional-set cc_tx_cnp_pkts
$ rdma statistic mode
link rocep8s0f0/1 optional-set cc_rx_ce_pkts cc_tx_cnp_pkts

>
>> $ rdma statistic show link rocep8s0f0/1
>> link rocep8s0f0/1 rx_write_requests 0 rx_read_requests 0 rx_atomic_requests 0 out_of_buffer 0
>> out_of_sequence 0 duplicate_request 0 rnr_nak_retry_err 0 packet_seq_err 0 implied_nak_seq_err 0
>> local_ack_timeout_err 0 resp_local_length_error 0 resp_cqe_error 0 req_cqe_error 0
>> req_remote_invalid_request 0 req_remote_access_errors 0 resp_remote_access_errors 0
>> resp_cqe_flush_error 0 req_cqe_flush_error 0 roce_adp_retrans 0 roce_adp_retrans_to 0
>> roce_slow_restart 0 roce_slow_restart_cnps 0 roce_slow_restart_trans 0 rp_cnp_ignored 0
>> rp_cnp_handled 0 np_ecn_marked_roce_packets 0 np_cnp_sent 0 rx_icrc_encapsulated 0
>> Optional-set: cc_rx_ce_pkts 0 cc_tx_cnp_pkts 0
>
> Also looks bad, optional counters should not be marked specially at
> this point.

Will put optional counters in the last, like this:

$ rdma statistic show link rocep8s0f0/1
link rocep8s0f0/1 rx_write_requests 0 rx_read_requests 0 rx_atomic_requests 0 out_of_buffer 0
out_of_sequence 0 duplicate_request 0 rnr_nak_retry_err 0 packet_seq_err 0 implied_nak_seq_err 0
local_ack_timeout_err 0 resp_local_length_error 0 resp_cqe_error 0 req_cqe_error 0
req_remote_invalid_request 0 req_remote_access_errors 0 resp_remote_access_errors 0
resp_cqe_flush_error 0 req_cqe_flush_error 0 roce_adp_retrans 0 roce_adp_retrans_to 0
roce_slow_restart 0 roce_slow_restart_cnps 0 roce_slow_restart_trans 0 rp_cnp_ignored 0
rp_cnp_handled 0 np_ecn_marked_roce_packets 0 np_cnp_sent 0 rx_icrc_encapsulated 0
cc_rx_ce_pkts 0 cc_tx_cnp_pkts 0

>> Aharon Landau (9):
>>   net/mlx5: Add support in bth_opcode as a match criteria
>>   net/mlx5: Add priorities for counters in RDMA namespaces
>>   RDMA/counters: Support to allocate per-port optional counter
>>     statistics
>>   RDMA/mlx5: Add alloc_op_port_stats() support
>>   RDMA/mlx5: Add steering support in optional flow counters
>>   RDMA/nldev: Add support to add and remove optional counters
>>   RDMA/mlx5: Add add_op_stat() and remove_op_stat() support
>>   RDMA/mlx5: Add get_op_stats() support
>>   RDMA/nldev: Add support to get current enabled optional counters
>>
>> Neta Ostrovsky (1):
>>   RDMA/nldev: Add support to get optional counters statistics
>
> This series is in a poor order, all the core update should come first
> and the commit messages should explain what is going on when building
> out the new APIs.
>
> The RDMA/mlx5 patches can go last

Will fix it, thanks Jason.

> Jason
>
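
The revised `show` layout proposed above, with optional counters appended after the default ones and no special marker, could be sketched roughly as follows. This is an illustration only; `format_show_line` is a hypothetical helper, and the shortened counter lists stand in for the full output quoted in the thread.

```python
def format_show_line(link, default_counters, optional_counters):
    """Render 'link <dev/port> <name> <value> ...' with optional counters last.

    Optional counters get no prefix or label; they simply follow the defaults.
    """
    fields = ["link", link]
    # Dicts keep insertion order, so defaults come first, then optionals.
    for name, value in [*default_counters.items(), *optional_counters.items()]:
        fields += [name, str(value)]
    return " ".join(fields)


# Shortened counter sets, for illustration only.
defaults = {"rx_write_requests": 0, "rx_icrc_encapsulated": 0}
optionals = {"cc_rx_ce_pkts": 0, "cc_tx_cnp_pkts": 0}
show_line = format_show_line("rocep8s0f0/1", defaults, optionals)
```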
On Tue, Aug 24, 2021 at 09:44:26AM +0800, Mark Zhang wrote:
> On 8/24/2021 3:33 AM, Jason Gunthorpe wrote:
> > On Wed, Aug 18, 2021 at 02:24:18PM +0300, Mark Zhang wrote:
> > > Hi,
> > >
> > > This series from Aharon and Neta provides an extension to the rdma
> > > statistics tool that allows to add and remove optional counters
> > > dynamically, using new netlink commands.
> > >
> > > The idea of having optional counters is to provide to the users the
> > > ability to get statistics of counters that hurts performance.
> > >
> > > Once an optional counter was added, its statistics will be presented
> > > along with all the counters, using the show command.
> > >
> > > Binding objects to the optional counters is currently not supported,
> > > neither in auto mode nor in manual mode.
> > >
> > > To get the list of optional counters that are supported on this device,
> > > use "rdma statistic mode supported". To see which counters are currently
> > > enabled, use "rdma statistic mode".
> > >
> > > $ rdma statistic mode supported
> > > link rocep8s0f0/1
> > > Optional-set: cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts
> > > link rocep8s0f1/1
> > > Optional-set: cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts
> > >
> > > $ sudo rdma statistic add link rocep8s0f0/1 optional-set cc_rx_ce_pkts
> > > $ rdma statistic mode
> > > link rocep8s0f0/1
> > > Optional-set: cc_rx_ce_pkts
> > > $ sudo rdma statistic add link rocep8s0f0/1 optional-set cc_tx_cnp_pkts
> > > $ rdma statistic mode
> > > link rocep8s0f0/1
> > > Optional-set: cc_rx_ce_pkts cc_tx_cnp_pkts
> >
> > This doesn't look like the right output to iproute to me, the two
> > command should not be using the same tag and the output of iproute
> > should always be formed to be valid input to iproute
>
> So it should be like this:
>
> $ rdma statistic mode supported
> link rocep8s0f0/1 optional-set cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts
> link rocep8s0f1/1 optional-set cc_rx_ce_pkts cc_rx_cnp_pkts cc_tx_cnp_pkts

Each netlink tag in the protocol should have a unique string in the
output. So you need strings that mean "optional set supported" and
"optional set currently enabled"

Jason
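
One way to read the "unique string per netlink tag" rule is as a one-to-one table from attribute to output keyword, checked for collisions. The sketch below is purely illustrative: the attribute names and keyword strings are placeholders invented here, not names from the kernel uAPI or from iproute2.

```python
# Hypothetical attribute names and keywords, for illustration only.
OUTPUT_KEYWORDS = {
    "ATTR_OPTIONAL_COUNTERS_SUPPORTED": "optional-set-supported",
    "ATTR_OPTIONAL_COUNTERS_ENABLED": "optional-set-enabled",
}


def check_unique_keywords(keywords):
    """Raise ValueError if two attributes would print the same keyword."""
    seen = {}
    for attr, keyword in keywords.items():
        if keyword in seen:
            raise ValueError(
                f"{attr} and {seen[keyword]} both print as '{keyword}'"
            )
        seen[keyword] = attr
    return True


# The table above passes; the layout criticized in the thread would not.
table_ok = check_unique_keywords(OUTPUT_KEYWORDS)
```

With both attributes mapped to the same string, as in the reposted `optional-set` output for `mode` and `mode supported`, the check raises, which mirrors the objection in the reply.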