Message ID | 1491924533-2548-1-git-send-email-honli@redhat.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: > From: Honggang Li <honli@redhat.com> > > According to InfiniBand Architecture Release 1.2.1, Table 208 > Example PathRecord Request MAD Header Fields, MADHeader:Method > should setup to 0x12 (SubnAdmGetTable()). That is just an example, Table 192 shows that Get and GetTable are both valid query types. I however expect that the SM will fail when using GetTable if the query returns more than one result, could that be what is happening? When you convert it to GET_TABLE does it return more than one result? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/11/2017 11:28 AM, Honggang LI wrote: > From: Honggang Li <honli@redhat.com> > > According to InfiniBand Architecture Release 1.2.1, Table 208 > Example PathRecord Request MAD Header Fields, MADHeader:Method > should setup to 0x12 (SubnAdmGetTable()). Both get and get table should be supported for Path Record. > Before send the MAD packet for PathRecord request, init_srp_mad setup > out_mad->method to SRP_MAD_METHOD_GET (0x01) for get_shared_pkeys. But > get_shared_pkeys setup the attr_id field to SRP_SA_ATTR_PATH_REC (0x35). > > Because of this incorrect field in MAD packet, upstream srptools-1.0.3 > failed with an embedded subnet manager running on an Intel True Scale > Edge Switch 12300. Sounds like a bug in the embedded SM. If it works with GetTable and not with Get, sounds like there is more than 1 path available to be returned. SA PathRecord:NumbPath says: In a SubnAdmGetTable() query request: Maximum number of paths to return for each unique SGIDDGID combination. If more paths that satisfy the PathRecord query exist for a given SGID-DGID combination, only NumbPath paths shall be returned (implementation defined). In a SubnAdmGet() query request, ignored; a value of 1 is used. So in case of multiple paths and get method, SM is supposed to pick one rather than return some error (like too many records). Is that the MAD status returned ? -- Hal > Upstream srptools-1.0.3 works with upstream opensm, because > sa_mad_ctrl_process post the MAD to the dispatcher based on the attr_id > field. As attr_id had been set to SRP_SA_ATTR_PATH_REC (0x35), > PathRecord query works with opensm. > > opensm/opensm/osm_sa_mad_ctrl.c > 292 static void sa_mad_ctrl_rcv_callback(IN osm_madw_t * p_madw, IN void *context, > ......... > 357 switch (p_sa_mad->method) { > ......... > 370 case IB_MAD_METHOD_GET: // 0x01 > 371 case IB_MAD_METHOD_GETTABLE: // 0x12 > 372 #if defined (VENDOR_RMPP_SUPPORT) && defined (DUAL_SIDED_RMPP) > 373 case IB_MAD_METHOD_GETMULTI: > 374 #endif > 375 is_get_request = TRUE; > 376 case IB_MAD_METHOD_SET: > 377 case IB_MAD_METHOD_DELETE: > 378 /* if we are closing down simply do nothing */ > 379 if (osm_exit_flag) > 380 osm_mad_pool_put(p_ctrl->p_mad_pool, p_madw); > 381 else > 382 sa_mad_ctrl_process(p_ctrl, p_madw, is_get_request); > 383 break; > 384 > > 2ad09524931dbf98d412e1912c1bdbf22f8ac81d ("srp_daemon: Work around SM > bug over non-default P_Key support") in the old git tree [1] introduced > this regression issue. Upstream srptools-0.0.4 works with the embedded > subnet manager. > > [1] git://git.openfabrics.org/~bvanassche/srptools.git > > Signed-off-by: Honggang Li <honli@redhat.com> > --- > srp_daemon/srp_daemon.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c > index 71b5f07..f905c6f 100644 > --- a/srp_daemon/srp_daemon.c > +++ b/srp_daemon/srp_daemon.c > @@ -1134,6 +1134,7 @@ static int get_shared_pkeys(struct resources *res, > continue; > > /* Mark components: DLID, SLID, PKEY */ > + out_sa_mad->method = SRP_SA_METHOD_GET_TABLE; > out_sa_mad->comp_mask = htobe64(1 << 4 | 1 << 5 | 1 << 13); > out_sa_mad->rmpp_hdr.rmpp_version = UMAD_RMPP_VERSION; > out_sa_mad->rmpp_hdr.rmpp_type = 1; > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: > On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: > > From: Honggang Li <honli@redhat.com> > > > > According to InfiniBand Architecture Release 1.2.1, Table 208 > > Example PathRecord Request MAD Header Fields, MADHeader:Method > > should setup to 0x12 (SubnAdmGetTable()). > > That is just an example, Table 192 shows that Get and GetTable are > both valid query types. I don't understand this as "Table 192 PortInfoRecord" seems nothing to do with PathRecord request. > > I however expect that the SM will fail when using GetTable if the > query returns more than one result, could that be what is happening? > When you convert it to GET_TABLE does it return more than one result? strace output shows srptools always received MAD packet in 320 bytes. So the embedded subnet manager only returns one result. > > Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/12/2017 9:29 AM, Honggang LI wrote: > On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: >> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: >>> From: Honggang Li <honli@redhat.com> >>> >>> According to InfiniBand Architecture Release 1.2.1, Table 208 >>> Example PathRecord Request MAD Header Fields, MADHeader:Method >>> should setup to 0x12 (SubnAdmGetTable()). >> >> That is just an example, Table 192 shows that Get and GetTable are >> both valid query types. > > I don't understand this as "Table 192 PortInfoRecord" seems nothing to > do with PathRecord request. In IBA 1.3 volume 1, it's Table 209 Subnet Administration Attribute/Method Map shows PathRecord methods are Get and GetTable. >> >> I however expect that the SM will fail when using GetTable if the >> query returns more than one result, could that be what is happening? >> When you convert it to GET_TABLE does it return more than one result? > > strace output shows srptools always received MAD packet in 320 bytes. So > the embedded subnet manager only returns one result. What is MAD status ? -- Hal >> >> Jason > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/11/2017 12:20 PM, Hal Rosenstock wrote: > On 4/11/2017 11:28 AM, Honggang LI wrote: >> From: Honggang Li <honli@redhat.com> >> >> According to InfiniBand Architecture Release 1.2.1, Table 208 >> Example PathRecord Request MAD Header Fields, MADHeader:Method >> should setup to 0x12 (SubnAdmGetTable()). > > Both get and get table should be supported for Path Record. > >> Before send the MAD packet for PathRecord request, init_srp_mad setup >> out_mad->method to SRP_MAD_METHOD_GET (0x01) for get_shared_pkeys. But >> get_shared_pkeys setup the attr_id field to SRP_SA_ATTR_PATH_REC (0x35). >> >> Because of this incorrect field in MAD packet, upstream srptools-1.0.3 >> failed with an embedded subnet manager running on an Intel True Scale >> Edge Switch 12300. > > Sounds like a bug in the embedded SM. > > If it works with GetTable and not with Get, sounds like there is more > than 1 path available to be returned. > > SA PathRecord:NumbPath says: > In a SubnAdmGetTable() query request: Maximum > number of paths to return for each unique SGIDDGID > combination. If more paths that satisfy the > PathRecord query exist for a given SGID-DGID combination, > only NumbPath paths shall be returned > (implementation defined). > In a SubnAdmGet() query request, ignored; a value of > 1 is used. > > So in case of multiple paths and get method, SM is supposed to pick one > rather than return some error (like too many records). Is that the MAD > status returned ? > > -- Hal > >> Upstream srptools-1.0.3 works with upstream opensm, because >> sa_mad_ctrl_process post the MAD to the dispatcher based on the attr_id >> field. As attr_id had been set to SRP_SA_ATTR_PATH_REC (0x35), >> PathRecord query works with opensm. >> >> opensm/opensm/osm_sa_mad_ctrl.c >> 292 static void sa_mad_ctrl_rcv_callback(IN osm_madw_t * p_madw, IN void *context, >> ......... >> 357 switch (p_sa_mad->method) { >> ......... >> 370 case IB_MAD_METHOD_GET: // 0x01 >> 371 case IB_MAD_METHOD_GETTABLE: // 0x12 >> 372 #if defined (VENDOR_RMPP_SUPPORT) && defined (DUAL_SIDED_RMPP) >> 373 case IB_MAD_METHOD_GETMULTI: >> 374 #endif >> 375 is_get_request = TRUE; >> 376 case IB_MAD_METHOD_SET: >> 377 case IB_MAD_METHOD_DELETE: >> 378 /* if we are closing down simply do nothing */ >> 379 if (osm_exit_flag) >> 380 osm_mad_pool_put(p_ctrl->p_mad_pool, p_madw); >> 381 else >> 382 sa_mad_ctrl_process(p_ctrl, p_madw, is_get_request); >> 383 break; >> 384 >> >> 2ad09524931dbf98d412e1912c1bdbf22f8ac81d ("srp_daemon: Work around SM >> bug over non-default P_Key support") in the old git tree [1] introduced >> this regression issue. Upstream srptools-0.0.4 works with the embedded >> subnet manager. >> >> [1] git://git.openfabrics.org/~bvanassche/srptools.git >> >> Signed-off-by: Honggang Li <honli@redhat.com> >> --- >> srp_daemon/srp_daemon.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c >> index 71b5f07..f905c6f 100644 >> --- a/srp_daemon/srp_daemon.c >> +++ b/srp_daemon/srp_daemon.c >> @@ -1134,6 +1134,7 @@ static int get_shared_pkeys(struct resources *res, >> continue; >> >> /* Mark components: DLID, SLID, PKEY */ >> + out_sa_mad->method = SRP_SA_METHOD_GET_TABLE; >> out_sa_mad->comp_mask = htobe64(1 << 4 | 1 << 5 | 1 << 13); Note that just changing this to GetTable is noncompliant. SA PathRecord says that for a GetTable request both SGID and NumbPath are required. -- Hal >> out_sa_mad->rmpp_hdr.rmpp_version = UMAD_RMPP_VERSION; >> out_sa_mad->rmpp_hdr.rmpp_type = 1; >> -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 12, 2017 at 09:34:20AM -0400, Hal Rosenstock wrote: > On 4/12/2017 9:29 AM, Honggang LI wrote: > > On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: > >> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: > >>> From: Honggang Li <honli@redhat.com> > >>> > >>> According to InfiniBand Architecture Release 1.2.1, Table 208 > >>> Example PathRecord Request MAD Header Fields, MADHeader:Method > >>> should setup to 0x12 (SubnAdmGetTable()). > >> > >> That is just an example, Table 192 shows that Get and GetTable are > >> both valid query types. > > > > I don't understand this as "Table 192 PortInfoRecord" seems nothing to > > do with PathRecord request. > > In IBA 1.3 volume 1, it's Table 209 Subnet Administration > Attribute/Method Map shows PathRecord methods are Get and GetTable. > http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/truescale-infiniband-12300-switch-brief.pdf Interoperability • Compliant with IBTA* specifications 1.0a, 1.1, 1.2, and 1.2. It seems the embedded subnet manager does not support IBA 1.3 > >> > >> I however expect that the SM will fail when using GetTable if the > >> query returns more than one result, could that be what is happening? > >> When you convert it to GET_TABLE does it return more than one result? > > > > strace output shows srptools always received MAD packet in 320 bytes. So > > the embedded subnet manager only returns one result. > > What is MAD status ? ib_user_mad->status = 0x0 I attached the MAD packet dump out with strace. Please see the attachments p4.in amd p4.decode. > > -- Hal > > >> > >> Jason > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x01\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x03\x02\x81\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x04\x00\x35\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x30\x00\x00\x00\x00\x00\x00\x00\x00\xfe\x80\x00\x00\x00\x00\x00\x00\x00\x11\x75\x00\x00\x6f\x9b\xba\xfe\x80\x00\x00\x00\x00\x00\x00\x00\x11\x75\x00\x00\x6f\x9a\x34\x01\x9b\x00\xed\x00\x00\x00\x00\x00\x80\xff\xff\x00\x00\x84\x87\x8f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 ib_user_mad->agent_id = 0x0 ib_user_mad->status = 0x0 ib_user_mad->timeout_ms = 0x0 ib_user_mad->retries = 0x0 ib_user_mad->length = 0x140 = 320 ib_mad_addr->qpn = BE32 0x1000000 ib_mad_addr->qkey = BE32 0x0 ib_mad_addr->lid = BE16 0x100 ib_mad_addr->sl = 0x0 ib_mad_addr->path_bits = 0x0 ib_mad_addr->grh_present = 0x0 ib_mad_addr->gid_index = 0x0 ib_mad_addr->hop_limit = 0x0 ib_mad_addr->traffic_class = 0x0 ib_mad_addr->gid[16]= 0x\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 ib_mad_addr->flow_label = BE32 0x0 ib_mad_addr->pkey_index = BE16 0x0 ib_mad_addr->reserved[6]= 0x\x00\x00\x00\x00\x00\x00 srp_dm_rmpp_sa_mad->base_version = 0x1 srp_dm_rmpp_sa_mad->mgmt_class = 0x3 srp_dm_rmpp_sa_mad->class_version = 0x2 srp_dm_rmpp_sa_mad->method = 0x81 //IB_MAD_METHOD_GET_RESP opensm/include/iba/ib_types.h // GetResp() Method (13.4.5) srp_dm_rmpp_sa_mad->status = 0x0 srp_dm_rmpp_sa_mad->reserved1 = 0x0 srp_dm_rmpp_sa_mad->tid = 0x400000008000000 srp_dm_rmpp_sa_mad->attr_id = 0x3500 srp_dm_rmpp_sa_mad->reserved2 = 0x0 srp_dm_rmpp_sa_mad->attr_mod = 0x0 srp_dm_rmpp_sa_mad->rmpp_version = 0x0 srp_dm_rmpp_sa_mad->rmpp_type = 0x0 srp_dm_rmpp_sa_mad->rmpp_rtime_flags = 0x0 srp_dm_rmpp_sa_mad->rmpp_status = 0x0 srp_dm_rmpp_sa_mad->seg_num = 0x0 srp_dm_rmpp_sa_mad->paylen_newwin = 0x0 srp_dm_rmpp_sa_mad->sm_key = 0x0 srp_dm_rmpp_sa_mad->attr_offset = 0x0 srp_dm_rmpp_sa_mad->reserved3 = 0x0 srp_dm_rmpp_sa_mad->comp_mask = 0x3020000000000000 ib_path_rec_t->resv0[8] = \x00\x00\x00\x00 ib_path_rec_t->dgid = \xfe\x00\x00\x00\x00\x75\x00\x9b ib_path_rec_t->sgid = \xfe\x00\x00\x00\x00\x75\x00\x9a ib_path_rec_t->dlid = ib_net16_t 0x9b01 ib_path_rec_t->slid = ib_net16_t 0xed00 ib_path_rec_t->hop_flow_raw = ib_net32_t 0x0 ib_path_rec_t->tclass = 0x0 ib_path_rec_t->num_path = 0x80 ib_path_rec_t->pkey = ib_net16_t 0xffff ib_path_rec_t->sl = ib_net16_t 0x0 ib_path_rec_t->mtu = 0x84 ib_path_rec_t->rate = 0x87 ib_path_rec_t->pkt_life = 0x8f ib_path_rec_t->preference = 0x0 ib_path_rec_t->resv2[6] = \x00\x00\x00
On Tue, Apr 11, 2017 at 12:20:29PM -0400, Hal Rosenstock wrote: > On 4/11/2017 11:28 AM, Honggang LI wrote: > > From: Honggang Li <honli@redhat.com> > > > > According to InfiniBand Architecture Release 1.2.1, Table 208 > > Example PathRecord Request MAD Header Fields, MADHeader:Method > > should setup to 0x12 (SubnAdmGetTable()). > > Both get and get table should be supported for Path Record. > > > Before send the MAD packet for PathRecord request, init_srp_mad setup > > out_mad->method to SRP_MAD_METHOD_GET (0x01) for get_shared_pkeys. But > > get_shared_pkeys setup the attr_id field to SRP_SA_ATTR_PATH_REC (0x35). > > > > Because of this incorrect field in MAD packet, upstream srptools-1.0.3 > > failed with an embedded subnet manager running on an Intel True Scale > > Edge Switch 12300. > > Sounds like a bug in the embedded SM. > > If it works with GetTable and not with Get, sounds like there is more > than 1 path available to be returned. > It is a big fat tree IB network. Some SRPT had been connected to the same leaf switch as the node srptools running. Other SRPT connected to different leaf switch. But PathRecord Request failed for all of SRPTs. > SA PathRecord:NumbPath says: > In a SubnAdmGetTable() query request: Maximum > number of paths to return for each unique SGIDDGID > combination. If more paths that satisfy the > PathRecord query exist for a given SGID-DGID combination, > only NumbPath paths shall be returned > (implementation defined). > In a SubnAdmGet() query request, ignored; a value of > 1 is used. > > So in case of multiple paths and get method, SM is supposed to pick one > rather than return some error (like too many records). Is that the MAD > status returned ? status always been zero. > > -- Hal > > > Upstream srptools-1.0.3 works with upstream opensm, because > > sa_mad_ctrl_process post the MAD to the dispatcher based on the attr_id > > field. As attr_id had been set to SRP_SA_ATTR_PATH_REC (0x35), > > PathRecord query works with opensm. > > > > opensm/opensm/osm_sa_mad_ctrl.c > > 292 static void sa_mad_ctrl_rcv_callback(IN osm_madw_t * p_madw, IN void *context, > > ......... > > 357 switch (p_sa_mad->method) { > > ......... > > 370 case IB_MAD_METHOD_GET: // 0x01 > > 371 case IB_MAD_METHOD_GETTABLE: // 0x12 > > 372 #if defined (VENDOR_RMPP_SUPPORT) && defined (DUAL_SIDED_RMPP) > > 373 case IB_MAD_METHOD_GETMULTI: > > 374 #endif > > 375 is_get_request = TRUE; > > 376 case IB_MAD_METHOD_SET: > > 377 case IB_MAD_METHOD_DELETE: > > 378 /* if we are closing down simply do nothing */ > > 379 if (osm_exit_flag) > > 380 osm_mad_pool_put(p_ctrl->p_mad_pool, p_madw); > > 381 else > > 382 sa_mad_ctrl_process(p_ctrl, p_madw, is_get_request); > > 383 break; > > 384 > > > > 2ad09524931dbf98d412e1912c1bdbf22f8ac81d ("srp_daemon: Work around SM > > bug over non-default P_Key support") in the old git tree [1] introduced > > this regression issue. Upstream srptools-0.0.4 works with the embedded > > subnet manager. > > > > [1] git://git.openfabrics.org/~bvanassche/srptools.git > > > > Signed-off-by: Honggang Li <honli@redhat.com> > > --- > > srp_daemon/srp_daemon.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c > > index 71b5f07..f905c6f 100644 > > --- a/srp_daemon/srp_daemon.c > > +++ b/srp_daemon/srp_daemon.c > > @@ -1134,6 +1134,7 @@ static int get_shared_pkeys(struct resources *res, > > continue; > > > > /* Mark components: DLID, SLID, PKEY */ > > + out_sa_mad->method = SRP_SA_METHOD_GET_TABLE; > > out_sa_mad->comp_mask = htobe64(1 << 4 | 1 << 5 | 1 << 13); > > out_sa_mad->rmpp_hdr.rmpp_version = UMAD_RMPP_VERSION; > > out_sa_mad->rmpp_hdr.rmpp_type = 1; > > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/12/2017 10:18 AM, Honggang LI wrote: > On Wed, Apr 12, 2017 at 09:34:20AM -0400, Hal Rosenstock wrote: >> On 4/12/2017 9:29 AM, Honggang LI wrote: >>> On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: >>>> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: >>>>> From: Honggang Li <honli@redhat.com> >>>>> >>>>> According to InfiniBand Architecture Release 1.2.1, Table 208 >>>>> Example PathRecord Request MAD Header Fields, MADHeader:Method >>>>> should setup to 0x12 (SubnAdmGetTable()). >>>> >>>> That is just an example, Table 192 shows that Get and GetTable are >>>> both valid query types. >>> >>> I don't understand this as "Table 192 PortInfoRecord" seems nothing to >>> do with PathRecord request. >> >> In IBA 1.3 volume 1, it's Table 209 Subnet Administration >> Attribute/Method Map shows PathRecord methods are Get and GetTable. >> > > http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/truescale-infiniband-12300-switch-brief.pdf > Interoperability > • Compliant with IBTA* specifications 1.0a, 1.1, 1.2, and 1.2. > > It seems the embedded subnet manager does not support IBA 1.3 It's same for IBA 1.2.1. This part hasn't changed. >>>> >>>> I however expect that the SM will fail when using GetTable if the >>>> query returns more than one result, could that be what is happening? >>>> When you convert it to GET_TABLE does it return more than one result? >>> >>> strace output shows srptools always received MAD packet in 320 bytes. So >>> the embedded subnet manager only returns one result. >> >> What is MAD status ? > > ib_user_mad->status = 0x0 > > I attached the MAD packet dump out with strace. Please see the > attachments p4.in amd p4.decode. p4.decode has: rp_dm_rmpp_sa_mad->method = 0x81 //IB_MAD_METHOD_GET_RESP opensm/include/iba/ib_types.h // GetResp() Method (13.4.5) Proper response to GetTable (0x12) method is GetTableResponse (0x92) method but this is GetResp (0x81) response. -- Hal >> >> -- Hal >> >>>> >>>> Jason >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 12, 2017 at 10:34:47AM -0400, Hal Rosenstock wrote: > On 4/12/2017 10:18 AM, Honggang LI wrote: > > On Wed, Apr 12, 2017 at 09:34:20AM -0400, Hal Rosenstock wrote: > >> On 4/12/2017 9:29 AM, Honggang LI wrote: > >>> On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: > >>>> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: > >>>>> From: Honggang Li <honli@redhat.com> > >>>>> > >>>>> According to InfiniBand Architecture Release 1.2.1, Table 208 > >>>>> Example PathRecord Request MAD Header Fields, MADHeader:Method > >>>>> should setup to 0x12 (SubnAdmGetTable()). > >>>> > >>>> That is just an example, Table 192 shows that Get and GetTable are > >>>> both valid query types. > >>> > >>> I don't understand this as "Table 192 PortInfoRecord" seems nothing to > >>> do with PathRecord request. > >> > >> In IBA 1.3 volume 1, it's Table 209 Subnet Administration > >> Attribute/Method Map shows PathRecord methods are Get and GetTable. > >> > > > > http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/truescale-infiniband-12300-switch-brief.pdf > > Interoperability > > • Compliant with IBTA* specifications 1.0a, 1.1, 1.2, and 1.2. > > > > It seems the embedded subnet manager does not support IBA 1.3 > > It's same for IBA 1.2.1. This part hasn't changed. > > >>>> > >>>> I however expect that the SM will fail when using GetTable if the > >>>> query returns more than one result, could that be what is happening? > >>>> When you convert it to GET_TABLE does it return more than one result? > >>> > >>> strace output shows srptools always received MAD packet in 320 bytes. So > >>> the embedded subnet manager only returns one result. > >> > >> What is MAD status ? > > > > ib_user_mad->status = 0x0 > > > > I attached the MAD packet dump out with strace. Please see the > > attachments p4.in amd p4.decode. > > p4.decode has: > rp_dm_rmpp_sa_mad->method = 0x81 //IB_MAD_METHOD_GET_RESP > opensm/include/iba/ib_types.h // GetResp() Method (13.4.5) > > Proper response to GetTable (0x12) method is GetTableResponse (0x92) > method but this is GetResp (0x81) response. Yes, you are right. p4.decode was captured with unpatched upstream srptools-1.0.3. p4-184.decode was created with patched srptools-1.0.3. > > -- Hal > > >> > >> -- Hal > >> > >>>> > >>>> Jason > >>> -- > >>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > >>> the body of a message to majordomo@vger.kernel.org > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>> > >> -- > >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > >> the body of a message to majordomo@vger.kernel.org > >> More majordomo info at http://vger.kernel.org/majordomo-info.html ib_user_mad->agent_id = 0x0 ib_user_mad->status = 0x0 ib_user_mad->timeout_ms = 0x0 ib_user_mad->retries = 0x0 ib_user_mad->length = 0xb8 = 184 ib_mad_addr->qpn = BE32 0x1000000 ib_mad_addr->qkey = BE32 0x0 ib_mad_addr->lid = BE16 0x100 ib_mad_addr->sl = 0x0 ib_mad_addr->path_bits = 0x0 ib_mad_addr->grh_present = 0x0 ib_mad_addr->gid_index = 0x0 ib_mad_addr->hop_limit = 0x0 ib_mad_addr->traffic_class = 0x0 ib_mad_addr->gid[16]= 0x\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 ib_mad_addr->flow_label = BE32 0x0 ib_mad_addr->pkey_index = BE16 0x0 ib_mad_addr->reserved[6]= 0x\x00\x00\x00\x00\x00\x00 srp_dm_rmpp_sa_mad->base_version = 0x1 srp_dm_rmpp_sa_mad->mgmt_class = 0x3 srp_dm_rmpp_sa_mad->class_version = 0x2 srp_dm_rmpp_sa_mad->method = 0x92 // IB_MAD_METHOD_GETTABLE_RESP opensm/include/iba/ib_types.h // SubnAdmGetTableResp() Method (15.2.2) srp_dm_rmpp_sa_mad->status = 0x0 srp_dm_rmpp_sa_mad->reserved1 = 0x0 srp_dm_rmpp_sa_mad->tid = 0x400000035000000 srp_dm_rmpp_sa_mad->attr_id = 0x3500 srp_dm_rmpp_sa_mad->reserved2 = 0x0 srp_dm_rmpp_sa_mad->attr_mod = 0x0 srp_dm_rmpp_sa_mad->rmpp_version = 0x1 srp_dm_rmpp_sa_mad->rmpp_type = 0x1 srp_dm_rmpp_sa_mad->rmpp_rtime_flags = 0xff srp_dm_rmpp_sa_mad->rmpp_status = 0x0 srp_dm_rmpp_sa_mad->seg_num = 0x1000000 srp_dm_rmpp_sa_mad->paylen_newwin = 0x54000000 srp_dm_rmpp_sa_mad->sm_key = 0x0 srp_dm_rmpp_sa_mad->attr_offset = 0x800 srp_dm_rmpp_sa_mad->reserved3 = 0x0 srp_dm_rmpp_sa_mad->comp_mask = 0x0 ib_path_rec_t->resv0[8] = \x00\x00\x00\x00 ib_path_rec_t->dgid = \xfe\x00\x00\x00\x00\x75\x00\x9b ib_path_rec_t->sgid = \xfe\x00\x00\x00\x00\x75\x00\x9a ib_path_rec_t->dlid = ib_net16_t 0x9b01 ib_path_rec_t->slid = ib_net16_t 0xed00 ib_path_rec_t->hop_flow_raw = ib_net32_t 0x0 ib_path_rec_t->tclass = 0x0 ib_path_rec_t->num_path = 0x80 ib_path_rec_t->pkey = ib_net16_t 0xffff ib_path_rec_t->sl = ib_net16_t 0x0 ib_path_rec_t->mtu = 0x84 ib_path_rec_t->rate = 0x87 ib_path_rec_t->pkt_life = 0x8f ib_path_rec_t->preference = 0x0 ib_path_rec_t->resv2[6] = \x00\x00\x00
On 4/12/2017 10:49 AM, Honggang LI wrote: > On Wed, Apr 12, 2017 at 10:34:47AM -0400, Hal Rosenstock wrote: >> On 4/12/2017 10:18 AM, Honggang LI wrote: >>> On Wed, Apr 12, 2017 at 09:34:20AM -0400, Hal Rosenstock wrote: >>>> On 4/12/2017 9:29 AM, Honggang LI wrote: >>>>> On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: >>>>>> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: >>>>>>> From: Honggang Li <honli@redhat.com> >>>>>>> >>>>>>> According to InfiniBand Architecture Release 1.2.1, Table 208 >>>>>>> Example PathRecord Request MAD Header Fields, MADHeader:Method >>>>>>> should setup to 0x12 (SubnAdmGetTable()). >>>>>> >>>>>> That is just an example, Table 192 shows that Get and GetTable are >>>>>> both valid query types. >>>>> >>>>> I don't understand this as "Table 192 PortInfoRecord" seems nothing to >>>>> do with PathRecord request. >>>> >>>> In IBA 1.3 volume 1, it's Table 209 Subnet Administration >>>> Attribute/Method Map shows PathRecord methods are Get and GetTable. >>>> >>> >>> http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/truescale-infiniband-12300-switch-brief.pdf >>> Interoperability >>> • Compliant with IBTA* specifications 1.0a, 1.1, 1.2, and 1.2. >>> >>> It seems the embedded subnet manager does not support IBA 1.3 >> >> It's same for IBA 1.2.1. This part hasn't changed. >> >>>>>> >>>>>> I however expect that the SM will fail when using GetTable if the >>>>>> query returns more than one result, could that be what is happening? >>>>>> When you convert it to GET_TABLE does it return more than one result? >>>>> >>>>> strace output shows srptools always received MAD packet in 320 bytes. So >>>>> the embedded subnet manager only returns one result. >>>> >>>> What is MAD status ? >>> >>> ib_user_mad->status = 0x0 >>> >>> I attached the MAD packet dump out with strace. Please see the >>> attachments p4.in amd p4.decode. >> >> p4.decode has: >> rp_dm_rmpp_sa_mad->method = 0x81 //IB_MAD_METHOD_GET_RESP >> opensm/include/iba/ib_types.h // GetResp() Method (13.4.5) >> >> Proper response to GetTable (0x12) method is GetTableResponse (0x92) >> method but this is GetResp (0x81) response. > > Yes, you are right. p4.decode was captured with unpatched upstream srptools-1.0.3. against the embedded SM ? If so, this response looks (mostly) good to me and should be accepted by srp_daemon. I think I see the issue in srp_daemon: it's due to the check of the attribute offset which is not being set in the response: size = ib_get_attr_size(in_sa_mad->attr_offset); if (!size) { if (config->verbose) printf("PathRec Query did not find any targets " "over P_Key %x\n", pkey); continue; } p4.decode has: srp_dm_rmpp_sa_mad->attr_offset = 0x0 Since a get rather than get table is being used, a good MAD status means that a valid PR was returned and there is no need for this check. Would you try using the original get method without this code block and verify that it works with the embedded SM ? Thanks. -- Hal > p4-184.decode was created with patched srptools-1.0.3. >> >> -- Hal >> >>>> >>>> -- Hal >>>> >>>>>> >>>>>> Jason >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/12/2017 12:51 PM, Hal Rosenstock wrote: > On 4/12/2017 10:49 AM, Honggang LI wrote: >> On Wed, Apr 12, 2017 at 10:34:47AM -0400, Hal Rosenstock wrote: >>> On 4/12/2017 10:18 AM, Honggang LI wrote: >>>> On Wed, Apr 12, 2017 at 09:34:20AM -0400, Hal Rosenstock wrote: >>>>> On 4/12/2017 9:29 AM, Honggang LI wrote: >>>>>> On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: >>>>>>> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: >>>>>>>> From: Honggang Li <honli@redhat.com> >>>>>>>> >>>>>>>> According to InfiniBand Architecture Release 1.2.1, Table 208 >>>>>>>> Example PathRecord Request MAD Header Fields, MADHeader:Method >>>>>>>> should setup to 0x12 (SubnAdmGetTable()). >>>>>>> >>>>>>> That is just an example, Table 192 shows that Get and GetTable are >>>>>>> both valid query types. >>>>>> >>>>>> I don't understand this as "Table 192 PortInfoRecord" seems nothing to >>>>>> do with PathRecord request. >>>>> >>>>> In IBA 1.3 volume 1, it's Table 209 Subnet Administration >>>>> Attribute/Method Map shows PathRecord methods are Get and GetTable. >>>>> >>>> >>>> http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/truescale-infiniband-12300-switch-brief.pdf >>>> Interoperability >>>> • Compliant with IBTA* specifications 1.0a, 1.1, 1.2, and 1.2. >>>> >>>> It seems the embedded subnet manager does not support IBA 1.3 >>> >>> It's same for IBA 1.2.1. This part hasn't changed. >>> >>>>>>> >>>>>>> I however expect that the SM will fail when using GetTable if the >>>>>>> query returns more than one result, could that be what is happening? >>>>>>> When you convert it to GET_TABLE does it return more than one result? >>>>>> >>>>>> strace output shows srptools always received MAD packet in 320 bytes. So >>>>>> the embedded subnet manager only returns one result. >>>>> >>>>> What is MAD status ? >>>> >>>> ib_user_mad->status = 0x0 >>>> >>>> I attached the MAD packet dump out with strace. Please see the >>>> attachments p4.in amd p4.decode. >>> >>> p4.decode has: >>> rp_dm_rmpp_sa_mad->method = 0x81 //IB_MAD_METHOD_GET_RESP >>> opensm/include/iba/ib_types.h // GetResp() Method (13.4.5) >>> >>> Proper response to GetTable (0x12) method is GetTableResponse (0x92) >>> method but this is GetResp (0x81) response. >> >> Yes, you are right. p4.decode was captured with unpatched upstream srptools-1.0.3. > > against the embedded SM ? If so, this response looks (mostly) good to me and should be accepted by srp_daemon. I think I see the issue in srp_daemon: it's due to the check of the attribute offset which is not being set in the response: > > size = ib_get_attr_size(in_sa_mad->attr_offset); > if (!size) { > if (config->verbose) > printf("PathRec Query did not find any targets " > "over P_Key %x\n", pkey); > continue; > } > > p4.decode has: > srp_dm_rmpp_sa_mad->attr_offset = 0x0 > > Since a get rather than get table is being used, a good MAD status means that a valid PR was returned and there is no need for this check. > > Would you try using the original get method without this code block and verify that it works with the embedded SM ? Just submitted PR for this: https://github.com/linux-rdma/rdma-core/pull/116 I can post patch if you like. -- Hal > Thanks. > > -- Hal > >> p4-184.decode was created with patched srptools-1.0.3. > > >>> >>> -- Hal >>> >>>>> >>>>> -- Hal >>>>> >>>>>>> >>>>>>> Jason >>>>>> -- >>>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>>>>> the body of a message to majordomo@vger.kernel.org >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>> >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 12, 2017 at 12:51:25PM -0400, Hal Rosenstock wrote: > On 4/12/2017 10:49 AM, Honggang LI wrote: > > On Wed, Apr 12, 2017 at 10:34:47AM -0400, Hal Rosenstock wrote: > >> On 4/12/2017 10:18 AM, Honggang LI wrote: > >>> On Wed, Apr 12, 2017 at 09:34:20AM -0400, Hal Rosenstock wrote: > >>>> On 4/12/2017 9:29 AM, Honggang LI wrote: > >>>>> On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: > >>>>>> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: > >>>>>>> From: Honggang Li <honli@redhat.com> > >>>>>>> > >>>>>>> According to InfiniBand Architecture Release 1.2.1, Table 208 > >>>>>>> Example PathRecord Request MAD Header Fields, MADHeader:Method > >>>>>>> should setup to 0x12 (SubnAdmGetTable()). > >>>>>> > >>>>>> That is just an example, Table 192 shows that Get and GetTable are > >>>>>> both valid query types. > >>>>> > >>>>> I don't understand this as "Table 192 PortInfoRecord" seems nothing to > >>>>> do with PathRecord request. > >>>> > >>>> In IBA 1.3 volume 1, it's Table 209 Subnet Administration > >>>> Attribute/Method Map shows PathRecord methods are Get and GetTable. > >>>> > >>> > >>> http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/truescale-infiniband-12300-switch-brief.pdf > >>> Interoperability > >>> • Compliant with IBTA* specifications 1.0a, 1.1, 1.2, and 1.2. > >>> > >>> It seems the embedded subnet manager does not support IBA 1.3 > >> > >> It's same for IBA 1.2.1. This part hasn't changed. > >> > >>>>>> > >>>>>> I however expect that the SM will fail when using GetTable if the > >>>>>> query returns more than one result, could that be what is happening? > >>>>>> When you convert it to GET_TABLE does it return more than one result? > >>>>> > >>>>> strace output shows srptools always received MAD packet in 320 bytes. So > >>>>> the embedded subnet manager only returns one result. > >>>> > >>>> What is MAD status ? > >>> > >>> ib_user_mad->status = 0x0 > >>> > >>> I attached the MAD packet dump out with strace. Please see the > >>> attachments p4.in amd p4.decode. > >> > >> p4.decode has: > >> rp_dm_rmpp_sa_mad->method = 0x81 //IB_MAD_METHOD_GET_RESP > >> opensm/include/iba/ib_types.h // GetResp() Method (13.4.5) > >> > >> Proper response to GetTable (0x12) method is GetTableResponse (0x92) > >> method but this is GetResp (0x81) response. > > > > Yes, you are right. p4.decode was captured with unpatched upstream srptools-1.0.3. > > against the embedded SM ? Yes. > If so, this response looks (mostly) good to me and should be accepted by srp_daemon. I think I see the issue in srp_daemon: it's due to the check of the attribute offset which is not being set in the response: > > size = ib_get_attr_size(in_sa_mad->attr_offset); > if (!size) { > if (config->verbose) > printf("PathRec Query did not find any targets " > "over P_Key %x\n", pkey); > continue; > } > > p4.decode has: > srp_dm_rmpp_sa_mad->attr_offset = 0x0 > > Since a get rather than get table is being used, a good MAD status means that a valid PR was returned and there is no need for this check. > > Would you try using the original get method without this code block and verify that it works with the embedded SM ? > I sent the scratch RPM with the patch you suggested to third party for testing. Please wait. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Apr 13, 2017 at 07:08:58AM +0800, Honggang LI wrote: > On Wed, Apr 12, 2017 at 12:51:25PM -0400, Hal Rosenstock wrote: > > On 4/12/2017 10:49 AM, Honggang LI wrote: > > > On Wed, Apr 12, 2017 at 10:34:47AM -0400, Hal Rosenstock wrote: > > >> On 4/12/2017 10:18 AM, Honggang LI wrote: > > >>> On Wed, Apr 12, 2017 at 09:34:20AM -0400, Hal Rosenstock wrote: > > >>>> On 4/12/2017 9:29 AM, Honggang LI wrote: > > >>>>> On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: > > >>>>>> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: > > >>>>>>> From: Honggang Li <honli@redhat.com> > > >>>>>>> > > >>>>>>> According to InfiniBand Architecture Release 1.2.1, Table 208 > > >>>>>>> Example PathRecord Request MAD Header Fields, MADHeader:Method > > >>>>>>> should setup to 0x12 (SubnAdmGetTable()). > > >>>>>> > > >>>>>> That is just an example, Table 192 shows that Get and GetTable are > > >>>>>> both valid query types. > > >>>>> > > >>>>> I don't understand this as "Table 192 PortInfoRecord" seems nothing to > > >>>>> do with PathRecord request. > > >>>> > > >>>> In IBA 1.3 volume 1, it's Table 209 Subnet Administration > > >>>> Attribute/Method Map shows PathRecord methods are Get and GetTable. > > >>>> > > >>> > > >>> http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/truescale-infiniband-12300-switch-brief.pdf > > >>> Interoperability > > >>> • Compliant with IBTA* specifications 1.0a, 1.1, 1.2, and 1.2. > > >>> > > >>> It seems the embedded subnet manager does not support IBA 1.3 > > >> > > >> It's same for IBA 1.2.1. This part hasn't changed. > > >> > > >>>>>> > > >>>>>> I however expect that the SM will fail when using GetTable if the > > >>>>>> query returns more than one result, could that be what is happening? > > >>>>>> When you convert it to GET_TABLE does it return more than one result? > > >>>>> > > >>>>> strace output shows srptools always received MAD packet in 320 bytes. So > > >>>>> the embedded subnet manager only returns one result. > > >>>> > > >>>> What is MAD status ? > > >>> > > >>> ib_user_mad->status = 0x0 > > >>> > > >>> I attached the MAD packet dump out with strace. Please see the > > >>> attachments p4.in amd p4.decode. > > >> > > >> p4.decode has: > > >> rp_dm_rmpp_sa_mad->method = 0x81 //IB_MAD_METHOD_GET_RESP > > >> opensm/include/iba/ib_types.h // GetResp() Method (13.4.5) > > >> > > >> Proper response to GetTable (0x12) method is GetTableResponse (0x92) > > >> method but this is GetResp (0x81) response. > > > > > > Yes, you are right. p4.decode was captured with unpatched upstream srptools-1.0.3. > > > > against the embedded SM ? > > Yes. > > > If so, this response looks (mostly) good to me and should be accepted by srp_daemon. I think I see the issue in srp_daemon: it's due to the check of the attribute offset which is not being set in the response: > > > > size = ib_get_attr_size(in_sa_mad->attr_offset); > > if (!size) { > > if (config->verbose) > > printf("PathRec Query did not find any targets " > > "over P_Key %x\n", pkey); > > continue; > > } > > > > p4.decode has: > > srp_dm_rmpp_sa_mad->attr_offset = 0x0 > > > > Since a get rather than get table is being used, a good MAD status means that a valid PR was returned and there is no need for this check. > > > > Would you try using the original get method without this code block and verify that it works with the embedded SM ? Yes, it works. Thanks > > > > I sent the scratch RPM with the patch you suggested to third party for testing. Please wait. > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 12, 2017 at 01:39:02PM -0400, Hal Rosenstock wrote: > On 4/12/2017 12:51 PM, Hal Rosenstock wrote: > > On 4/12/2017 10:49 AM, Honggang LI wrote: > >> On Wed, Apr 12, 2017 at 10:34:47AM -0400, Hal Rosenstock wrote: > >>> On 4/12/2017 10:18 AM, Honggang LI wrote: > >>>> On Wed, Apr 12, 2017 at 09:34:20AM -0400, Hal Rosenstock wrote: > >>>>> On 4/12/2017 9:29 AM, Honggang LI wrote: > >>>>>> On Tue, Apr 11, 2017 at 09:49:44AM -0600, Jason Gunthorpe wrote: > >>>>>>> On Tue, Apr 11, 2017 at 11:28:53PM +0800, Honggang LI wrote: > >>>>>>>> From: Honggang Li <honli@redhat.com> > >>>>>>>> > >>>>>>>> According to InfiniBand Architecture Release 1.2.1, Table 208 > >>>>>>>> Example PathRecord Request MAD Header Fields, MADHeader:Method > >>>>>>>> should setup to 0x12 (SubnAdmGetTable()). > >>>>>>> > >>>>>>> That is just an example, Table 192 shows that Get and GetTable are > >>>>>>> both valid query types. > >>>>>> > >>>>>> I don't understand this as "Table 192 PortInfoRecord" seems nothing to > >>>>>> do with PathRecord request. > >>>>> > >>>>> In IBA 1.3 volume 1, it's Table 209 Subnet Administration > >>>>> Attribute/Method Map shows PathRecord methods are Get and GetTable. > >>>>> > >>>> > >>>> http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/truescale-infiniband-12300-switch-brief.pdf > >>>> Interoperability > >>>> • Compliant with IBTA* specifications 1.0a, 1.1, 1.2, and 1.2. > >>>> > >>>> It seems the embedded subnet manager does not support IBA 1.3 > >>> > >>> It's same for IBA 1.2.1. This part hasn't changed. > >>> > >>>>>>> > >>>>>>> I however expect that the SM will fail when using GetTable if the > >>>>>>> query returns more than one result, could that be what is happening? > >>>>>>> When you convert it to GET_TABLE does it return more than one result? > >>>>>> > >>>>>> strace output shows srptools always received MAD packet in 320 bytes. So > >>>>>> the embedded subnet manager only returns one result. > >>>>> > >>>>> What is MAD status ? > >>>> > >>>> ib_user_mad->status = 0x0 > >>>> > >>>> I attached the MAD packet dump out with strace. Please see the > >>>> attachments p4.in amd p4.decode. > >>> > >>> p4.decode has: > >>> rp_dm_rmpp_sa_mad->method = 0x81 //IB_MAD_METHOD_GET_RESP > >>> opensm/include/iba/ib_types.h // GetResp() Method (13.4.5) > >>> > >>> Proper response to GetTable (0x12) method is GetTableResponse (0x92) > >>> method but this is GetResp (0x81) response. > >> > >> Yes, you are right. p4.decode was captured with unpatched upstream srptools-1.0.3. > > > > against the embedded SM ? If so, this response looks (mostly) good to me and should be accepted by srp_daemon. I think I see the issue in srp_daemon: it's due to the check of the attribute offset which is not being set in the response: > > > > size = ib_get_attr_size(in_sa_mad->attr_offset); > > if (!size) { > > if (config->verbose) > > printf("PathRec Query did not find any targets " > > "over P_Key %x\n", pkey); > > continue; > > } > > > > p4.decode has: > > srp_dm_rmpp_sa_mad->attr_offset = 0x0 > > > > Since a get rather than get table is being used, a good MAD status means that a valid PR was returned and there is no need for this check. > > > > Would you try using the original get method without this code block and verify that it works with the embedded SM ? > > Just submitted PR for this: > https://github.com/linux-rdma/rdma-core/pull/116 Those three patches PASSED the test. > I can post patch if you like. Please fix the typo I mentioned in the PR and post those three patches. thanks -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c index 71b5f07..f905c6f 100644 --- a/srp_daemon/srp_daemon.c +++ b/srp_daemon/srp_daemon.c @@ -1134,6 +1134,7 @@ static int get_shared_pkeys(struct resources *res, continue; /* Mark components: DLID, SLID, PKEY */ + out_sa_mad->method = SRP_SA_METHOD_GET_TABLE; out_sa_mad->comp_mask = htobe64(1 << 4 | 1 << 5 | 1 << 13); out_sa_mad->rmpp_hdr.rmpp_version = UMAD_RMPP_VERSION; out_sa_mad->rmpp_hdr.rmpp_type = 1;