
Support optional performance counters, including congestion control performance counters.

Message ID 1311206683.6898.184.camel@auk59.llnl.gov (mailing list archive)
State Accepted, archived
Delegated to: Ira Weiny

Commit Message

Al Chu July 21, 2011, 12:04 a.m. UTC
Hey everyone,

Here's a new patch series for the optional performance counters.  It
fixes up the issues brought up by Hal.  I also include a new patch that
fixes up the incorrect BITSOFFS for other fields in libibmad.

Al

On Wed, 2011-07-20 at 11:35 -0700, Hal Rosenstock wrote:
> Hi again Al,
> 
> On 7/20/2011 1:38 PM, Albert Chu wrote:
> > Hey Hal,
> > 
> > Thanks for the nit-catches.  As for
> > 
> >>> +     {32, 2, "PortVLXmitFlowCtlUpdateErrors0", mad_dump_uint},
> >>> +     {34, 2, "PortVLXmitFlowCtlUpdateErrors1", mad_dump_uint},
> >>> +     {36, 2, "PortVLXmitFlowCtlUpdateErrors2", mad_dump_uint},
> >>> +     {38, 2, "PortVLXmitFlowCtlUpdateErrors3", mad_dump_uint},
> >>> +     {40, 2, "PortVLXmitFlowCtlUpdateErrors4", mad_dump_uint},
> >>> +     {42, 2, "PortVLXmitFlowCtlUpdateErrors5", mad_dump_uint},
> >>> +     {44, 2, "PortVLXmitFlowCtlUpdateErrors6", mad_dump_uint},
> >>> +     {46, 2, "PortVLXmitFlowCtlUpdateErrors7", mad_dump_uint},
> >>> +     {48, 2, "PortVLXmitFlowCtlUpdateErrors8", mad_dump_uint},
> >>> +     {50, 2, "PortVLXmitFlowCtlUpdateErrors9", mad_dump_uint},
> >>> +     {52, 2, "PortVLXmitFlowCtlUpdateErrors10", mad_dump_uint},
> >>> +     {54, 2, "PortVLXmitFlowCtlUpdateErrors11", mad_dump_uint},
> >>> +     {56, 2, "PortVLXmitFlowCtlUpdateErrors12", mad_dump_uint},
> >>> +     {58, 2, "PortVLXmitFlowCtlUpdateErrors13", mad_dump_uint},
> >>> +     {60, 2, "PortVLXmitFlowCtlUpdateErrors14", mad_dump_uint},
> >>> +     {62, 2, "PortVLXmitFlowCtlUpdateErrors15", mad_dump_uint},
> >>
> >> Don't these need to be BITSOFFS(nn, 2)  ?
> > 
> > Perhaps there's a subtlety I'm missing.  If these require BITSOFFS, then
> > wouldn't the 16 bit fields require them too?  There are many places
> > amongst the performance counters that BITSOFFS isn't used w/ 16 bit
> > fields.
> 
> Yes; it looks like any field less than 32 bits should use BITSOFFS so I
> think that there are some existing things to fix in fields.c (the 16 bit
> fields that are not using the macro).
> 
> -- Hal
> 
> > Al
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
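
Hal's rule of thumb (fields narrower than 32 bits need BITSOFFS) follows from the offset arithmetic. The sketch below is an illustration, not the libibmad source: the function name bitsoffs() is hypothetical, and its body assumes the macro computes the offset as (o & ~31) | (32 - (o & 31) - w), which is how the macro in libibmad's fields.c is believed to be defined. Under that assumption, the offset is mirrored within its 32-bit word, so a full 32-bit field is left unchanged while 2-bit and 16-bit fields move.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns only the normalized bit offset.
 * Assumption: same arithmetic as the BITSOFFS(o, w) macro in libibmad's
 * fields.c, which also emits the width as a second value.
 *
 * (o & ~31)            keeps the start of the containing 32-bit word
 * 32 - (o & 31) - w    mirrors the position of the field inside that word
 */
static int bitsoffs(int o, int w)
{
	return (o & ~31) | (32 - (o & 31) - w);
}

/* Worked examples using offsets quoted in the thread above. */
static void check_examples(void)
{
	/* 2-bit PortVLXmitFlowCtlUpdateErrors fields: offsets are mirrored. */
	assert(bitsoffs(32, 2) == 62);	/* ...Errors0  */
	assert(bitsoffs(34, 2) == 60);	/* ...Errors1  */
	assert(bitsoffs(62, 2) == 32);	/* ...Errors15 */

	/* 16-bit fields also move, so they need the macro as well. */
	assert(bitsoffs(0, 16) == 16);
	assert(bitsoffs(16, 16) == 0);

	/* A word-aligned 32-bit field is unchanged, so omitting it is harmless. */
	assert(bitsoffs(64, 32) == 64);
}
```

Calling check_examples() from a test program exercises all six cases. The last case is the reason existing 32-bit entries work without the macro, while the 16-bit entries Al mentions do not.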

Comments

Ira Weiny July 22, 2011, 12:20 a.m. UTC | #1
On Wed, 20 Jul 2011 17:04:43 -0700
Albert Chu <chu11@llnl.gov> wrote:

> Hey everyone,
> 
> Here's a new patch series for the optional performance counters.  It
> fixes up the issues brought up by Hal.  I also include a new patch that
> fixes up the incorrect BITSOFFS for other fields in libibmad.

Thanks, all 3 applied.


Patch

diff --git a/man/perfquery.8 b/man/perfquery.8
index e01dc2f..7acc60c 100644
--- a/man/perfquery.8
+++ b/man/perfquery.8
@@ -6,10 +6,12 @@  perfquery \- query InfiniBand port counters
 .SH SYNOPSIS
 .B perfquery
 [\-d(ebug)] [\-G(uid)] [\-x|\-\-extended] [\-X|\-\-xmtsl] [\-S|\-\-rcvsl]
-[\-D|\-\-xmtdisc] [\-E|\-\-rcverr] [\-c|\-\-smplctl]
-[-a(ll_ports)] [-l(oop_ports)] [-r(eset_after_read)] [-R(eset_only)]
-[\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] [\-V(ersion)] [\-h(elp)]
-[<lid|guid> [[port] [reset_mask]]]
+[\-D|\-\-xmtdisc] [\-E|\-\-rcverr] [\-\-oprcvcounters] [\-\-flowctlcounters]
+[\-\-vloppackets] [\-\-vlopdata] [\-\-vlxmitflowctlerrors] [\-\-vlxmitcounters]
+[\-\-swportvlcong] [\-\-rcvcc] [\-\-slrcvfecn] [\-\-slrcvbecn] [\-\-xmitcc]
+[\-\-vlxmittimecc] [\-c|\-\-smplctl] [-a(ll_ports)] [-l(oop_ports)]
+[-r(eset_after_read)] [-R(eset_only)] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms]
+[\-V(ersion)] [\-h(elp)] [<lid|guid> [[port] [reset_mask]]]
 
 .SH DESCRIPTION
 .PP
@@ -49,6 +51,42 @@  show receive error details. This is an optional counter.
 \fB\-D\fR, \fB\-\-xmtdisc\fR
 show transmit discard details. This is an optional counter.
 .TP
+\fB\-\-oprcvcounters\fR
+show Rcv Counters per Op code. This is an optional counter.
+.TP
+\fB\-\-flowctlcounters\fR
+show flow control counters. This is an optional counter.
+.TP
+\fB\-\-vloppackets\fR
+show packets received per Op code per VL. This is an optional counter.
+.TP
+\fB\-\-vlopdata\fR
+show data received per Op code per VL. This is an optional counter.
+.TP
+\fB\-\-vlxmitflowctlerrors\fR
+show flow control update errors per VL. This is an optional counter.
+.TP
+\fB\-\-vlxmitcounters\fR
+show ticks waiting to transmit counters per VL. This is an optional counter.
+.TP
+\fB\-\-swportvlcong\fR
+show sw port VL congestion. This is an optional counter.
+.TP
+\fB\-\-rcvcc\fR
+show Rcv congestion control counters. This is an optional counter.
+.TP
+\fB\-\-slrcvfecn\fR
+show SL Rcv FECN counters. This is an optional counter.
+.TP
+\fB\-\-slrcvbecn\fR
+show SL Rcv BECN counters. This is an optional counter.
+.TP
+\fB\-\-xmitcc\fR
+show Xmit congestion control counters. This is an optional counter.
+.TP
+\fB\-\-vlxmittimecc\fR
+show VL Xmit Time congestion control counters. This is an optional counter.
+.TP
 \fB\-c\fR, \fB\-\-smplctl\fR
 show port samples control.
 .TP
diff --git a/src/perfquery.c b/src/perfquery.c
index 8923654..0ea68aa 100644
--- a/src/perfquery.c
+++ b/src/perfquery.c
@@ -368,7 +368,9 @@  static void reset_counters(int extended, int timeout, int mask,
 }
 
 static int reset, reset_only, all_ports, loop_ports, port, extended, xmt_sl,
-    rcv_sl, xmt_disc, rcv_err, smpl_ctl;
+    rcv_sl, xmt_disc, rcv_err, smpl_ctl, oprcvcounters, flowctlcounters,
+    vloppackets, vlopdata, vlxmitflowctlerrors, vlxmitcounters, swportvlcong,
+    rcvcc, slrcvfecn, slrcvbecn, xmitcc, vlxmittimecc;
 
 static void common_func(ib_portid_t * portid, int port_num, int mask,
 			unsigned query, unsigned reset,
@@ -423,6 +425,90 @@  static void rcv_err_query(ib_portid_t * portid, int port, int mask)
 		    mad_dump_perfcounters_rcv_err);
 }
 
+static void oprcvcounters_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortOpRcvCounters", IB_GSI_PORT_PORT_OP_RCV_COUNTERS,
+		    mad_dump_perfcounters_port_op_rcv_counters);
+}
+
+static void flowctlcounters_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortFlowCtlCounters", IB_GSI_PORT_PORT_FLOW_CTL_COUNTERS,
+		    mad_dump_perfcounters_port_flow_ctl_counters);
+}
+
+static void vloppackets_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortVLOpPackets", IB_GSI_PORT_PORT_VL_OP_PACKETS,
+		    mad_dump_perfcounters_port_vl_op_packet);
+}
+
+static void vlopdata_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortVLOpData", IB_GSI_PORT_PORT_VL_OP_DATA,
+		    mad_dump_perfcounters_port_vl_op_data);
+}
+
+static void vlxmitflowctlerrors_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortVLXmitFlowCtlUpdateErrors", IB_GSI_PORT_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS,
+		    mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors);
+}
+
+static void vlxmitcounters_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortVLXmitWaitCounters", IB_GSI_PORT_PORT_VL_XMIT_WAIT_COUNTERS,
+		    mad_dump_perfcounters_port_vl_xmit_wait_counters);
+}
+
+static void swportvlcong_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "SwPortVLCongestion", IB_GSI_SW_PORT_VL_CONGESTION,
+		    mad_dump_perfcounters_sw_port_vl_congestion);
+}
+
+static void rcvcc_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortRcvConCtrl", IB_GSI_PORT_RCV_CON_CTRL,
+		    mad_dump_perfcounters_rcv_con_ctrl);
+}
+
+static void slrcvfecn_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortSLRcvFECN", IB_GSI_PORT_SL_RCV_FECN,
+		    mad_dump_perfcounters_sl_rcv_fecn);
+}
+
+static void slrcvbecn_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortSLRcvBECN", IB_GSI_PORT_SL_RCV_BECN,
+		    mad_dump_perfcounters_sl_rcv_becn);
+}
+
+static void xmitcc_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortXmitConCtrl", IB_GSI_PORT_XMIT_CON_CTRL,
+		    mad_dump_perfcounters_xmit_con_ctrl);
+}
+
+static void vlxmittimecc_query(ib_portid_t * portid, int port, int mask)
+{
+	common_func(portid, port, mask, !reset_only, (reset_only || reset),
+		    "PortVLXmitTimeCong", IB_GSI_PORT_VL_XMIT_TIME_CONG,
+		    mad_dump_perfcounters_vl_xmit_time_cong);
+}
+
 void dump_portsamples_control(ib_portid_t * portid, int port)
 {
 	char buf[1024];
@@ -458,6 +544,42 @@  static int process_opt(void *context, int ch, char *optarg)
 	case 'c':
 		smpl_ctl = 1;
 		break;
+	case 1:
+		oprcvcounters = 1;
+		break;
+	case 2:
+		flowctlcounters = 1;
+		break;
+	case 3:
+		vloppackets = 1;
+		break;
+	case 4:
+		vlopdata = 1;
+		break;
+	case 5:
+		vlxmitflowctlerrors = 1;
+		break;
+	case 6:
+		vlxmitcounters = 1;
+		break;
+	case 7:
+		swportvlcong = 1;
+		break;
+	case 8:
+		rcvcc = 1;
+		break;
+	case 9:
+		slrcvfecn = 1;
+		break;
+	case 10:
+		slrcvbecn = 1;
+		break;
+	case 11:
+		xmitcc = 1;
+		break;
+	case 12:
+		vlxmittimecc = 1;
+		break;
 	case 'a':
 		all_ports++;
 		port = ALL_PORTS;
@@ -498,6 +620,18 @@  int main(int argc, char **argv)
 		{"rcvsl", 'S', 0, NULL, "show Rcv SL port counters"},
 		{"xmtdisc", 'D', 0, NULL, "show Xmt Discard Details"},
 		{"rcverr", 'E', 0, NULL, "show Rcv Error Details"},
+		{"oprcvcounters", 1, 0, NULL, "show Rcv Counters per Op code"},
+		{"flowctlcounters", 2, 0, NULL, "show flow control counters"},
+		{"vloppackets", 3, 0, NULL, "show packets received per Op code per VL"},
+		{"vlopdata", 4, 0, NULL, "show data received per Op code per VL"},
+		{"vlxmitflowctlerrors", 5, 0, NULL, "show flow control update errors per VL"},
+		{"vlxmitcounters", 6, 0, NULL, "show ticks waiting to transmit counters per VL"},
+		{"swportvlcong", 7, 0, NULL, "show sw port VL congestion"},
+		{"rcvcc", 8, 0, NULL, "show Rcv congestion control counters"},
+		{"slrcvfecn", 9, 0, NULL, "show SL Rcv FECN counters"},
+		{"slrcvbecn", 10, 0, NULL, "show SL Rcv BECN counters"},
+		{"xmitcc", 11, 0, NULL, "show Xmit congestion control counters"},
+		{"vlxmittimecc", 12, 0, NULL, "show VL Xmit Time congestion control counters"},
 		{"smplctl", 'c', 0, NULL, "show samples control"},
 		{"all_ports", 'a', 0, NULL, "show aggregated counters"},
 		{"loop_ports", 'l', 0, NULL, "iterate through each port"},
@@ -579,11 +713,72 @@  int main(int argc, char **argv)
 		goto done;
 	}
 
+	if (oprcvcounters) {
+		oprcvcounters_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (flowctlcounters) {
+		flowctlcounters_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (vloppackets) {
+		vloppackets_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (vlopdata) {
+		vlopdata_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (vlxmitflowctlerrors) {
+		vlxmitflowctlerrors_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (vlxmitcounters) {
+		vlxmitcounters_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (swportvlcong) {
+		swportvlcong_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (rcvcc) {
+		rcvcc_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (slrcvfecn) {
+		slrcvfecn_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (slrcvbecn) {
+		slrcvbecn_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (xmitcc) {
+		xmitcc_query(&portid, port, mask);
+		goto done;
+	}
+
+	if (vlxmittimecc) {
+		vlxmittimecc_query(&portid, port, mask);
+		goto done;
+	}
+
 	if (smpl_ctl) {
 		dump_portsamples_control(&portid, port);
 		goto done;
 	}
 
+
 	if (all_ports_loop || (loop_ports && (all_ports || port == ALL_PORTS))) {
 		if (smp_query_via(data, &portid, IB_ATTR_NODE_INFO, 0, 0,
 				  srcport) < 0)