diff mbox

[1/2] ibacm: Incorrect usage of BE byte order of MLID attach/detach_mcast()

Message ID 20171013184347.28410.87093.stgit@phlsvslse11.ph.intel.com (mailing list archive)
State Changes Requested
Headers show

Commit Message

Michael J. Ruhl Oct. 13, 2017, 6:43 p.m. UTC
From: Michael J. Ruhl <michael.j.ruhl@intel.com>

The MLID value passed to ibv_attach/detach_mcast() must be in host
byte order.

acmp.c incorrectly uses the big endian format when doing a multicast
attach/detach (join). Multicast packets are used to do name resolution
by the libibacmp library.

There are two possible results because of this issue.

If a kernel has commit 00b8a3351b2b, the attach will fail with an
EINVAL.  ibacm will log this as a failure during the multicast join.

If a kernel does not have commit 00b8a3351b2b, the attach will
complete successfully.  Packets sent to this address will be dropped
because the packet dlid value and the multicast address information
given by the attach will not match.

Update MLID usage to use the correct byte order.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
---
 ibacm/prov/acmp/src/acmp.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Hefty, Sean Oct. 13, 2017, 7:37 p.m. UTC | #1
PiBUaGUgTUxJRCB2YWx1ZSBwYXNzZWQgdG8gaWJ2X2F0dGFjaC9kZXRhY2hfbWNhc3QoKSBtdXN0
IGJlIGluIGhvc3QNCj4gYnl0ZSBvcmRlci4NCj4gDQo+IGFjbXAuYyBpbmNvcnJlY3RseSB1c2Vz
IHRoZSBiaWcgZW5kaWFuIGZvcm1hdCB3aGVuIGRvaW5nIGEgbXVsdGljYXN0DQo+IGF0dGFjaC9k
ZXRhY2ggKGpvaW4pLiBNdWx0aWNhc3QgcGFja2V0cyBhcmUgdXNlZCB0byBkbyBuYW1lIHJlc29s
dXRpb24NCj4gYnkgdGhlIGxpYmliYWNtcCBsaWJyYXJ5Lg0KPiANCj4gVGhlcmUgYXJlIHR3byBw
b3NzaWJsZSByZXN1bHRzIGJlY2F1c2Ugb2YgdGhpcyBpc3N1ZS4NCj4gDQo+IElmIGEga2VybmVs
IGhhcyBjb21taXQgMDBiOGEzMzUxYjJiLCB0aGUgYXR0YWNoIHdpbGwgZmFpbCB3aXRoIGFuDQo+
IEVJTlZBTC4gIGliYWNtIHdpbGwgbG9nIHRoaXMgYXMgYSBmYWlsdXJlIGR1cmluZyB0aGUgbXVs
dGljYXN0IGpvaW4uDQo+IA0KPiBJZiBhIGtlcm5lbCBkb2VzIG5vdCBoYXZlIGNvbW1pdCAwMGI4
YTMzNTFiMmIsIHRoZSBhdHRhY2ggd2lsbA0KPiBjb21wbGV0ZSBzdWNjZXNzZnVsbHkuICBQYWNr
ZXRzIHNlbnQgdG8gdGhpcyBhZGRyZXNzIHdpbGwgYmUgZHJvcHBlZA0KPiBiZWNhdXNlIHRoZSBw
YWNrZXQgZGxpZCB2YWx1ZSBhbmQgdGhlIG11bHRpY2FzdCBhZGRyZXNzIGluZm9ybWF0aW9uDQo+
IGdpdmVuIGJ5IHRoZSBhdHRhY2ggd2lsbCBub3QgbWF0Y2guDQo+IA0KPiBVcGRhdGUgTUxJRCB1
c2FnZSB0byB1c2UgdGhlIGNvcnJlY3QgYnl0ZSBvcmRlci4NCj4gDQo+IFJldmlld2VkLWJ5OiBN
aWtlIE1hcmNpbmlzenluIDxtaWtlLm1hcmNpbmlzenluQGludGVsLmNvbT4NCj4gU2lnbmVkLW9m
Zi1ieTogTWljaGFlbCBKLiBSdWhsIDxtaWNoYWVsLmoucnVobEBpbnRlbC5jb20+DQo+IC0tLQ0K
PiAgaWJhY20vcHJvdi9hY21wL3NyYy9hY21wLmMgfCAgICA0ICsrLS0NCj4gIDEgZmlsZXMgY2hh
bmdlZCwgMiBpbnNlcnRpb25zKCspLCAyIGRlbGV0aW9ucygtKQ0KPiANCj4gZGlmZiAtLWdpdCBh
L2liYWNtL3Byb3YvYWNtcC9zcmMvYWNtcC5jIGIvaWJhY20vcHJvdi9hY21wL3NyYy9hY21wLmMN
Cj4gaW5kZXggYWE3ODQxNi4uNzhkOWEyOSAxMDA2NDQNCj4gLS0tIGEvaWJhY20vcHJvdi9hY21w
L3NyYy9hY21wLmMNCj4gKysrIGIvaWJhY20vcHJvdi9hY21wL3NyYy9hY21wLmMNCj4gQEAgLTcz
Miw3ICs3MzIsNyBAQCBzdGF0aWMgdm9pZCBhY21wX3Byb2Nlc3Nfam9pbl9yZXNwKHN0cnVjdA0K
PiBhY21fc2FfbWFkICpzYV9tYWQpDQo+ICAJCQlhY21fbG9nKDAsICJFUlJPUiAtIHVuYWJsZSB0
byBjcmVhdGUgYWhcbiIpOw0KPiAgCQkJZ290byBvdXQ7DQo+ICAJCX0NCj4gLQkJcmV0ID0gaWJ2
X2F0dGFjaF9tY2FzdChlcC0+cXAsICZtY19yZWMtPm1naWQsIG1jX3JlYy0NCj4gPm1saWQpOw0K
PiArCQlyZXQgPSBpYnZfYXR0YWNoX21jYXN0KGVwLT5xcCwgJmRlc3QtPm1naWQsIGRlc3QtDQo+
ID5hdi5kbGlkKTsNCj4gIAkJaWYgKHJldCkgew0KPiAgCQkJYWNtX2xvZygwLCAiRVJST1IgLSB1
bmFibGUgdG8gYXR0YWNoIFFQIHRvIG11bHRpY2FzdA0KPiBncm91cFxuIik7DQo+ICAJCQlpYnZf
ZGVzdHJveV9haChkZXN0LT5haCk7DQo+IEBAIC0xNDI5LDcgKzE0MjksNyBAQCBzdGF0aWMgdm9p
ZCBhY21wX2VwX2pvaW4oc3RydWN0IGFjbXBfZXAgKmVwKQ0KPiANCj4gIAlpZiAoZXAtPm1jX2Rl
c3RbMF0uc3RhdGUgPT0gQUNNUF9SRUFEWSAmJiBlcC0+bWNfZGVzdFswXS5haCkgew0KPiAgCQlp
YnZfZGV0YWNoX21jYXN0KGVwLT5xcCwgJmVwLT5tY19kZXN0WzBdLm1naWQsDQo+IC0JCQkJIGJl
MTZ0b2goZXAtPm1jX2Rlc3RbMF0uYXYuZGxpZCkpOw0KPiArCQkJCSBlcC0+bWNfZGVzdFswXS5h
di5kbGlkKTsNCj4gIAkJaWJ2X2Rlc3Ryb3lfYWgoZXAtPm1jX2Rlc3RbMF0uYWgpOw0KPiAgCQll
cC0+bWNfZGVzdFswXS5haCA9IE5VTEw7DQo+ICAJfQ0KDQpDaGFuZ2VzIGxvb2sgY29ycmVjdCBm
b3IgYm90aCBwYXRjaGVzLg0KDQpBY2tlZC1ieTogU2VhbiBIZWZ0eSA8c2Vhbi5oZWZ0eUBpbnRl
bC5jb20+DQoNCkl0IHdvdWxkIGJlIG5pY2UgdG8gdW5kZXJzdGFuZCBob3cgdGhlIGNvZGUgd2Fz
IHdvcmtpbmcgaW4gdGhlIHBhc3QuICBBdCBsZWFzdCBpYmFjbSBoYXMgYmVlbiBhYmxlIHRvIHJl
cG9ydCBjYWNoZWQgZGF0YS4gIEFsbCBub2RlcyB3b3VsZCBoYXZlIGpvaW5lZCB0aGUgd3Jvbmcg
bWNhc3QgZ3JvdXAsIGJ1dCBhcyB5b3UgbWVudGlvbiB0aGUgZGxpZCBpbiB0aGUgQVYgd291bGRu
J3QgaGF2ZSBtYXRjaGVkLiAgSSB0cmllZCBsb29raW5nIGJhY2sgdGhyb3VnaCB0aGUgaWJhY20g
aGlzdG9yeSwgYnV0IGRpZG4ndCBzZWUgYW55IHJlbGV2YW50IGNoYW5nZXMuIA0K
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael J. Ruhl Oct. 13, 2017, 8:34 p.m. UTC | #2
> -----Original Message-----

> From: Hefty, Sean

> Sent: Friday, October 13, 2017 3:37 PM

> To: Ruhl, Michael J <michael.j.ruhl@intel.com>

> Cc: linux-rdma@vger.kernel.org; hal@dev.mellanox.co.il

> Subject: RE: [PATCH 1/2] ibacm: Incorrect usage of BE byte order of MLID

> attach/detach_mcast()

> 

> > The MLID value passed to ibv_attach/detach_mcast() must be in host

> > byte order.

> >

> > acmp.c incorrectly uses the big endian format when doing a multicast

> > attach/detach (join). Multicast packets are used to do name resolution

> > by the libibacmp library.

> >

> > There are two possible results because of this issue.

> >

> > If a kernel has commit 00b8a3351b2b, the attach will fail with an

> > EINVAL.  ibacm will log this as a failure during the multicast join.

> >

> > If a kernel does not have commit 00b8a3351b2b, the attach will

> > complete successfully.  Packets sent to this address will be dropped

> > because the packet dlid value and the multicast address information

> > given by the attach will not match.

> >

> > Update MLID usage to use the correct byte order.

> >

> > Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>

> > Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>

> > ---

> >  ibacm/prov/acmp/src/acmp.c |    4 ++--

> >  1 files changed, 2 insertions(+), 2 deletions(-)

> >

> > diff --git a/ibacm/prov/acmp/src/acmp.c b/ibacm/prov/acmp/src/acmp.c

> > index aa78416..78d9a29 100644

> > --- a/ibacm/prov/acmp/src/acmp.c

> > +++ b/ibacm/prov/acmp/src/acmp.c

> > @@ -732,7 +732,7 @@ static void acmp_process_join_resp(struct

> > acm_sa_mad *sa_mad)

> >  			acm_log(0, "ERROR - unable to create ah\n");

> >  			goto out;

> >  		}

> > -		ret = ibv_attach_mcast(ep->qp, &mc_rec->mgid, mc_rec-

> > >mlid);

> > +		ret = ibv_attach_mcast(ep->qp, &dest->mgid, dest-

> > >av.dlid);

> >  		if (ret) {

> >  			acm_log(0, "ERROR - unable to attach QP to multicast

> > group\n");

> >  			ibv_destroy_ah(dest->ah);

> > @@ -1429,7 +1429,7 @@ static void acmp_ep_join(struct acmp_ep *ep)

> >

> >  	if (ep->mc_dest[0].state == ACMP_READY && ep->mc_dest[0].ah) {

> >  		ibv_detach_mcast(ep->qp, &ep->mc_dest[0].mgid,

> > -				 be16toh(ep->mc_dest[0].av.dlid));

> > +				 ep->mc_dest[0].av.dlid);

> >  		ibv_destroy_ah(ep->mc_dest[0].ah);

> >  		ep->mc_dest[0].ah = NULL;

> >  	}

> 

> Changes look correct for both patches.

> 

> Acked-by: Sean Hefty <sean.hefty@intel.com>

> 

> It would be nice to understand how the code was working in the past.  At

> least ibacm has been able to report cached data.  All nodes would have

> joined the wrong mcast group, but as you mention the dlid in the AV

> wouldn't have matched.  I tried looking back through the ibacm history, but

> didn't see any relevant changes.


I don’t believe that the MLID value was checked.  I added a patch several months ago to verify that the MLID was a mcast lid (in ib_attach_mcast).  And it may be that this is what caused this to stop working.

M
Leon Romanovsky Oct. 16, 2017, 4:56 a.m. UTC | #3
On Fri, Oct 13, 2017 at 02:43:47PM -0400, Michael J. Ruhl wrote:
> From: Michael J. Ruhl <michael.j.ruhl@intel.com>
>
> The MLID value passed to ibv_attach/detach_mcast() must be in host
> byte order.
>
> acmp.c incorrectly uses the big endian format when doing a multicast
> attach/detach (join). Multicast packets are used to do name resolution
> by the libibacmp library.
>
> There are two possible results because of this issue.
>
> If a kernel has commit 00b8a3351b2b, the attach will fail with an
> EINVAL.  ibacm will log this as a failure during the multicast join.

Kernel doesn't have such commit.
https://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git/commit/?h=k.o/for-next&id=00b8a3351b2b

>
> If a kernel does not have commit 00b8a3351b2b, the attach will
> complete successfully.  Packets sent to this address will be dropped
> because the packet dlid value and the multicast address information
> given by the attach will not match.
>
> Update MLID usage to use the correct byte order.
>
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
> ---
>  ibacm/prov/acmp/src/acmp.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/ibacm/prov/acmp/src/acmp.c b/ibacm/prov/acmp/src/acmp.c
> index aa78416..78d9a29 100644
> --- a/ibacm/prov/acmp/src/acmp.c
> +++ b/ibacm/prov/acmp/src/acmp.c
> @@ -732,7 +732,7 @@ static void acmp_process_join_resp(struct acm_sa_mad *sa_mad)
>  			acm_log(0, "ERROR - unable to create ah\n");
>  			goto out;
>  		}
> -		ret = ibv_attach_mcast(ep->qp, &mc_rec->mgid, mc_rec->mlid);
> +		ret = ibv_attach_mcast(ep->qp, &dest->mgid, dest->av.dlid);
>  		if (ret) {
>  			acm_log(0, "ERROR - unable to attach QP to multicast group\n");
>  			ibv_destroy_ah(dest->ah);
> @@ -1429,7 +1429,7 @@ static void acmp_ep_join(struct acmp_ep *ep)
>
>  	if (ep->mc_dest[0].state == ACMP_READY && ep->mc_dest[0].ah) {
>  		ibv_detach_mcast(ep->qp, &ep->mc_dest[0].mgid,
> -				 be16toh(ep->mc_dest[0].av.dlid));
> +				 ep->mc_dest[0].av.dlid);
>  		ibv_destroy_ah(ep->mc_dest[0].ah);
>  		ep->mc_dest[0].ah = NULL;
>  	}
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/ibacm/prov/acmp/src/acmp.c b/ibacm/prov/acmp/src/acmp.c
index aa78416..78d9a29 100644
--- a/ibacm/prov/acmp/src/acmp.c
+++ b/ibacm/prov/acmp/src/acmp.c
@@ -732,7 +732,7 @@  static void acmp_process_join_resp(struct acm_sa_mad *sa_mad)
 			acm_log(0, "ERROR - unable to create ah\n");
 			goto out;
 		}
-		ret = ibv_attach_mcast(ep->qp, &mc_rec->mgid, mc_rec->mlid);
+		ret = ibv_attach_mcast(ep->qp, &dest->mgid, dest->av.dlid);
 		if (ret) {
 			acm_log(0, "ERROR - unable to attach QP to multicast group\n");
 			ibv_destroy_ah(dest->ah);
@@ -1429,7 +1429,7 @@  static void acmp_ep_join(struct acmp_ep *ep)
 
 	if (ep->mc_dest[0].state == ACMP_READY && ep->mc_dest[0].ah) {
 		ibv_detach_mcast(ep->qp, &ep->mc_dest[0].mgid,
-				 be16toh(ep->mc_dest[0].av.dlid));
+				 ep->mc_dest[0].av.dlid);
 		ibv_destroy_ah(ep->mc_dest[0].ah);
 		ep->mc_dest[0].ah = NULL;
 	}