Message ID | 20161222221334.GA15907@obsidianresearch.com (mailing list archive) |
---|---|
State | Accepted |
On 12/22/2016 5:13 PM, Jason Gunthorpe wrote:
> Valgrind reports:
>
> ==1196== Syscall param write(buf) points to uninitialised byte(s)
> ==1196==    at 0x506250D: ??? (syscall-template.S:84)
> ==1196==    by 0x527756F: ibv_cmd_modify_qp (cmd.c:1291)
> ==1196==    by 0x8008D74: mlx4_modify_qp (verbs.c:820)
> ==1196==    by 0x527E4F4: ibv_modify_qp@@IBVERBS_1.1 (verbs.c:561)
> ==1196==    by 0x4E3FAB3: ucma_modify_qp_err.isra.6 (cma.c:1115)
> ==1196==    by 0x4E41D56: rdma_get_cm_event.part.15 (cma.c:2180)
> ==1196==    by 0x402CF0: cm_thread (rping.c:576)
> ==1196==    by 0x5059709: start_thread (pthread_create.c:333)
> ==1196==    by 0x558A82C: clone (clone.S:109)
> ==1196==  Address 0x9847980 is on thread 2's stack
> ==1196==  in frame #2, created by mlx4_modify_qp (verbs.c:775)
>
> This is because of code like this:
>
>         struct ibv_qp_attr qp_attr;
>         qp_attr.qp_state = IBV_QPS_ERR;
>         return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, IBV_QP_STATE));
>
> Always pass 0 into the kernel for attributes that are not requested
> to be modified.
>
> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>

Thanks, applied.

On Thu, Dec 22, 2016 at 03:13:34PM -0700, Jason Gunthorpe wrote:
> Valgrind reports:
>
> ==1196== Syscall param write(buf) points to uninitialised byte(s)
> ==1196==    at 0x506250D: ??? (syscall-template.S:84)
> ==1196==    by 0x527756F: ibv_cmd_modify_qp (cmd.c:1291)
> ==1196==    by 0x8008D74: mlx4_modify_qp (verbs.c:820)
> ==1196==    by 0x527E4F4: ibv_modify_qp@@IBVERBS_1.1 (verbs.c:561)
> ==1196==    by 0x4E3FAB3: ucma_modify_qp_err.isra.6 (cma.c:1115)
> ==1196==    by 0x4E41D56: rdma_get_cm_event.part.15 (cma.c:2180)
> ==1196==    by 0x402CF0: cm_thread (rping.c:576)
> ==1196==    by 0x5059709: start_thread (pthread_create.c:333)
> ==1196==    by 0x558A82C: clone (clone.S:109)
> ==1196==  Address 0x9847980 is on thread 2's stack
> ==1196==  in frame #2, created by mlx4_modify_qp (verbs.c:775)
>
> This is because of code like this:
>
>         struct ibv_qp_attr qp_attr;
>         qp_attr.qp_state = IBV_QPS_ERR;
>         return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, IBV_QP_STATE));
>
> Always pass 0 into the kernel for attributes that are not requested
> to be modified.
>
> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
> ---
>  libibverbs/cmd.c | 170 +++++++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 121 insertions(+), 49 deletions(-)
>
> Shown with rping
>
> Please double check my if's.. I followed the man page
>
> I think there will be other cases where we do this wrong as well :\
>
> diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
> index 38061892da0de0..a702d67b05f2a3 100644
> --- a/libibverbs/cmd.c
> +++ b/libibverbs/cmd.c
> @@ -1221,55 +1221,127 @@ int ibv_cmd_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
>  {
>  	IBV_INIT_CMD(cmd, cmd_size, MODIFY_QP);

I didn't check all ibv_* commands, but for ibv_cmd_modify_qp there are no
callers which change cmd before this call. It looks like it is safe to
replace all the cmd->* = 0 assignments with one global memset(). Maybe it
is safe to put this memset in IBV_INIT_CMD too.

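The memset() suggestion in the reply above can be sketched outside of libibverbs. This is only an illustration under stated assumptions: `struct ms_attr`, `struct ms_cmd`, the `MS_QP_*` bits and both functions are hypothetical stand-ins, not the real libibverbs types or mask values. The stale caller memory is simulated with a 0xAA fill so that the sketch itself never reads uninitialised bytes. Assuming IBV_INIT_CMD writes header fields into cmd (its definition is not shown in this thread), a real change would have to zero the command before that macro runs, or zero only the payload; that detail is not modelled here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real libibverbs mask bits or structs. */
enum {
	MS_QP_STATE = 1 << 0,
	MS_QP_QKEY  = 1 << 1,
	MS_QP_PORT  = 1 << 2,
};

struct ms_attr {
	uint32_t qp_state;
	uint32_t qkey;
	uint8_t  port_num;
};

struct ms_cmd {
	uint32_t attr_mask;
	uint32_t qkey;
	uint8_t  qp_state;
	uint8_t  port_num;
	uint8_t  reserved[2];
};

/* Zero the whole command once, then copy only the requested fields. */
void ms_fill_cmd(struct ms_cmd *cmd, const struct ms_attr *attr,
		 uint32_t attr_mask)
{
	assert(cmd != NULL && attr != NULL);

	memset(cmd, 0, sizeof(*cmd));
	cmd->attr_mask = attr_mask;

	if (attr_mask & MS_QP_STATE)
		cmd->qp_state = (uint8_t)attr->qp_state;
	if (attr_mask & MS_QP_QKEY)
		cmd->qkey = attr->qkey;
	if (attr_mask & MS_QP_PORT)
		cmd->port_num = attr->port_num;
}

/* Returns 0 when only the requested field was taken from the caller. */
int ms_check(void)
{
	struct ms_attr attr;
	struct ms_cmd cmd;

	/* Simulate stale stack contents with a known fill pattern. */
	memset(&attr, 0xAA, sizeof(attr));
	attr.qp_state = 6;

	ms_fill_cmd(&cmd, &attr, MS_QP_STATE);

	return (cmd.qp_state == 6 && cmd.qkey == 0 && cmd.port_num == 0) ? 0 : 1;
}
```

Compared with per-field else branches, this keeps the copy logic to one if per attribute, at the cost of writing the command buffer twice for requested fields.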
On Thu, 2016-12-22 at 15:13 -0700, Jason Gunthorpe wrote:
> Valgrind reports:
>
> ==1196== Syscall param write(buf) points to uninitialised byte(s)
> ==1196==    at 0x506250D: ??? (syscall-template.S:84)
> ==1196==    by 0x527756F: ibv_cmd_modify_qp (cmd.c:1291)
> ==1196==    by 0x8008D74: mlx4_modify_qp (verbs.c:820)
> ==1196==    by 0x527E4F4: ibv_modify_qp@@IBVERBS_1.1 (verbs.c:561)
> ==1196==    by 0x4E3FAB3: ucma_modify_qp_err.isra.6 (cma.c:1115)
> ==1196==    by 0x4E41D56: rdma_get_cm_event.part.15 (cma.c:2180)
> ==1196==    by 0x402CF0: cm_thread (rping.c:576)
> ==1196==    by 0x5059709: start_thread (pthread_create.c:333)
> ==1196==    by 0x558A82C: clone (clone.S:109)
> ==1196==  Address 0x9847980 is on thread 2's stack
> ==1196==  in frame #2, created by mlx4_modify_qp (verbs.c:775)
>
> This is because of code like this:
>
>         struct ibv_qp_attr qp_attr;
>         qp_attr.qp_state = IBV_QPS_ERR;
>         return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, IBV_QP_STATE));
>
> Always pass 0 into the kernel for attributes that are not requested
> to be modified.

Hello Jason,

Have you considered to modify Valgrind? It is possible to modify Valgrind
such that it doesn't report false positives like the above report without
changing the rdma-core code. See also PRE(sys_ioctl) in source file
coregrind/m_syswrap/syswrap-linux.c.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Mon, Jan 02, 2017 at 08:02:02AM +0000, Bart Van Assche wrote:
> On Thu, 2016-12-22 at 15:13 -0700, Jason Gunthorpe wrote:
> > Valgrind reports:
> >
> > ==1196== Syscall param write(buf) points to uninitialised byte(s)
> > ==1196==    at 0x506250D: ??? (syscall-template.S:84)
> > ==1196==    by 0x527756F: ibv_cmd_modify_qp (cmd.c:1291)
> > ==1196==    by 0x8008D74: mlx4_modify_qp (verbs.c:820)
> > ==1196==    by 0x527E4F4: ibv_modify_qp@@IBVERBS_1.1 (verbs.c:561)
> > ==1196==    by 0x4E3FAB3: ucma_modify_qp_err.isra.6 (cma.c:1115)
> > ==1196==    by 0x4E41D56: rdma_get_cm_event.part.15 (cma.c:2180)
> > ==1196==    by 0x402CF0: cm_thread (rping.c:576)
> > ==1196==    by 0x5059709: start_thread (pthread_create.c:333)
> > ==1196==    by 0x558A82C: clone (clone.S:109)
> > ==1196==  Address 0x9847980 is on thread 2's stack
> > ==1196==  in frame #2, created by mlx4_modify_qp (verbs.c:775)
> >
> > This is because of code like this:
> >
> >         struct ibv_qp_attr qp_attr;
> >         qp_attr.qp_state = IBV_QPS_ERR;
> >         return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, IBV_QP_STATE));
> >
> > Always pass 0 into the kernel for attributes that are not requested

> Have you considered to modify Valgrind? It is possible to modify Valgrind
> such that it doesn't report false positives like the above report without
> changing the rdma-core code. See also PRE(sys_ioctl) in source file
> coregrind/m_syswrap/syswrap-linux.c.

I felt that passing uninitialized memory into the kernel was just
in general a bad idea, and adding the branches to copy zero instead of
un-init is probably performance neutral.

Even so, I don't think we can fix valgrind, ioctl is a different case
as ioctls are much more well defined, this is write() and valgrind
would have to first know we are writing to a uverbs FD which seems
challenging to determine, can valgrind already do this?

Jason

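Separately from the library-side change discussed above, a caller can avoid handing indeterminate bytes to the library by zeroing its attribute structure before setting the one field it cares about. A minimal sketch, assuming a hypothetical `struct caller_qp_attr` and `CALLER_QPS_ERR` constant as stand-ins for the real `struct ibv_qp_attr` and `IBV_QPS_ERR` (only three fields are modelled, and the function name is made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct ibv_qp_attr; only three fields shown. */
struct caller_qp_attr {
	int          qp_state;
	unsigned int qkey;
	unsigned int rq_psn;
};

/* Hypothetical value, used here for illustration only. */
#define CALLER_QPS_ERR 6

/*
 * Prepare an attribute struct for an "error state" transition so that
 * every field which is not set explicitly is zero.
 */
void caller_init_err_attr(struct caller_qp_attr *attr)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));	/* clears all fields and any padding */
	attr->qp_state = CALLER_QPS_ERR;
}
```

This does not replace the fix in the library, since other callers may still pass partially initialised structures; it only shows what a well-behaved caller could do on its side.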
On Mon, 2017-01-02 at 14:14 -0700, Jason Gunthorpe wrote:
> I felt that passing uninitialized memory into the kernel was just
> in general a bad idea, and adding the branches to copy zero instead of
> un-init is probably performance neutral.
>
> Even so, I don't think we can fix valgrind, ioctl is a different case
> as ioctls are much more well defined, this is write() and valgrind
> would have to first know we are writing to a uverbs FD which seems
> challenging to determine, can valgrind already do this?

Hello Jason,

As far as I know there is not yet any code in Valgrind to interpret the
data sent from user space to kernel through the write() system call. Since
I do not know any application for which ibv_modify_qp() is in the hot path
I think modifying the ibv_modify_qp() implementation is fine.

Bart.

diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
index 38061892da0de0..a702d67b05f2a3 100644
--- a/libibverbs/cmd.c
+++ b/libibverbs/cmd.c
@@ -1221,55 +1221,127 @@ int ibv_cmd_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
 {
 	IBV_INIT_CMD(cmd, cmd_size, MODIFY_QP);
 
-	cmd->qp_handle = qp->handle;
-	cmd->attr_mask = attr_mask;
-	cmd->qkey = attr->qkey;
-	cmd->rq_psn = attr->rq_psn;
-	cmd->sq_psn = attr->sq_psn;
-	cmd->dest_qp_num = attr->dest_qp_num;
-	cmd->qp_access_flags = attr->qp_access_flags;
-	cmd->pkey_index = attr->pkey_index;
-	cmd->alt_pkey_index = attr->alt_pkey_index;
-	cmd->qp_state = attr->qp_state;
-	cmd->cur_qp_state = attr->cur_qp_state;
-	cmd->path_mtu = attr->path_mtu;
-	cmd->path_mig_state = attr->path_mig_state;
-	cmd->en_sqd_async_notify = attr->en_sqd_async_notify;
-	cmd->max_rd_atomic = attr->max_rd_atomic;
-	cmd->max_dest_rd_atomic = attr->max_dest_rd_atomic;
-	cmd->min_rnr_timer = attr->min_rnr_timer;
-	cmd->port_num = attr->port_num;
-	cmd->timeout = attr->timeout;
-	cmd->retry_cnt = attr->retry_cnt;
-	cmd->rnr_retry = attr->rnr_retry;
-	cmd->alt_port_num = attr->alt_port_num;
-	cmd->alt_timeout = attr->alt_timeout;
-
-	memcpy(cmd->dest.dgid, attr->ah_attr.grh.dgid.raw, 16);
-	cmd->dest.flow_label = attr->ah_attr.grh.flow_label;
-	cmd->dest.dlid = attr->ah_attr.dlid;
-	cmd->dest.reserved = 0;
-	cmd->dest.sgid_index = attr->ah_attr.grh.sgid_index;
-	cmd->dest.hop_limit = attr->ah_attr.grh.hop_limit;
-	cmd->dest.traffic_class = attr->ah_attr.grh.traffic_class;
-	cmd->dest.sl = attr->ah_attr.sl;
-	cmd->dest.src_path_bits = attr->ah_attr.src_path_bits;
-	cmd->dest.static_rate = attr->ah_attr.static_rate;
-	cmd->dest.is_global = attr->ah_attr.is_global;
-	cmd->dest.port_num = attr->ah_attr.port_num;
-
-	memcpy(cmd->alt_dest.dgid, attr->alt_ah_attr.grh.dgid.raw, 16);
-	cmd->alt_dest.flow_label = attr->alt_ah_attr.grh.flow_label;
-	cmd->alt_dest.dlid = attr->alt_ah_attr.dlid;
-	cmd->alt_dest.reserved = 0;
-	cmd->alt_dest.sgid_index = attr->alt_ah_attr.grh.sgid_index;
-	cmd->alt_dest.hop_limit = attr->alt_ah_attr.grh.hop_limit;
-	cmd->alt_dest.traffic_class = attr->alt_ah_attr.grh.traffic_class;
-	cmd->alt_dest.sl = attr->alt_ah_attr.sl;
-	cmd->alt_dest.src_path_bits = attr->alt_ah_attr.src_path_bits;
-	cmd->alt_dest.static_rate = attr->alt_ah_attr.static_rate;
-	cmd->alt_dest.is_global = attr->alt_ah_attr.is_global;
-	cmd->alt_dest.port_num = attr->alt_ah_attr.port_num;
+	cmd->qp_handle = qp->handle;
+	cmd->attr_mask = attr_mask;
+
+	if (attr_mask & IBV_QP_STATE)
+		cmd->qp_state = attr->qp_state;
+	else
+		cmd->qp_state = 0;
+
+	if (attr_mask & IBV_QP_CUR_STATE)
+		cmd->cur_qp_state = attr->cur_qp_state;
+	else
+		cmd->cur_qp_state = 0;
+
+	if (attr_mask & IBV_QP_EN_SQD_ASYNC_NOTIFY)
+		cmd->en_sqd_async_notify = attr->en_sqd_async_notify;
+	else
+		cmd->en_sqd_async_notify = 0;
+
+	if (attr_mask & IBV_QP_ACCESS_FLAGS)
+		cmd->qp_access_flags = attr->qp_access_flags;
+	else
+		cmd->qp_access_flags = 0;
+	if (attr_mask & IBV_QP_PKEY_INDEX)
+		cmd->pkey_index = attr->pkey_index;
+	else
+		cmd->pkey_index = 0;
+	if (attr_mask & IBV_QP_PORT)
+		cmd->port_num = attr->port_num;
+	else
+		cmd->port_num = 0;
+	if (attr_mask & IBV_QP_QKEY)
+		cmd->qkey = attr->qkey;
+	else
+		cmd->qkey = 0;
+
+	if (attr_mask & IBV_QP_AV) {
+		memcpy(cmd->dest.dgid, attr->ah_attr.grh.dgid.raw, 16);
+		cmd->dest.flow_label = attr->ah_attr.grh.flow_label;
+		cmd->dest.dlid = attr->ah_attr.dlid;
+		cmd->dest.reserved = 0;
+		cmd->dest.sgid_index = attr->ah_attr.grh.sgid_index;
+		cmd->dest.hop_limit = attr->ah_attr.grh.hop_limit;
+		cmd->dest.traffic_class = attr->ah_attr.grh.traffic_class;
+		cmd->dest.sl = attr->ah_attr.sl;
+		cmd->dest.src_path_bits = attr->ah_attr.src_path_bits;
+		cmd->dest.static_rate = attr->ah_attr.static_rate;
+		cmd->dest.is_global = attr->ah_attr.is_global;
+		cmd->dest.port_num = attr->ah_attr.port_num;
+	} else
+		memset(&cmd->dest, 0, sizeof(cmd->dest));
+
+	if (attr_mask & IBV_QP_PATH_MTU)
+		cmd->path_mtu = attr->path_mtu;
+	else
+		cmd->path_mtu = 0;
+	if (attr_mask & IBV_QP_TIMEOUT)
+		cmd->timeout = attr->timeout;
+	else
+		cmd->timeout = 0;
+	if (attr_mask & IBV_QP_RETRY_CNT)
+		cmd->retry_cnt = attr->retry_cnt;
+	else
+		cmd->retry_cnt = 0;
+	if (attr_mask & IBV_QP_RNR_RETRY)
+		cmd->rnr_retry = attr->rnr_retry;
+	else
+		cmd->rnr_retry = 0;
+	if (attr_mask & IBV_QP_RQ_PSN)
+		cmd->rq_psn = attr->rq_psn;
+	else
+		cmd->rq_psn = 0;
+	if (attr_mask & IBV_QP_MAX_QP_RD_ATOMIC)
+		cmd->max_rd_atomic = attr->max_rd_atomic;
+	else
+		cmd->max_rd_atomic = 0;
+
+	if (attr_mask & IBV_QP_ALT_PATH) {
+		cmd->alt_pkey_index = attr->alt_pkey_index;
+		cmd->alt_port_num = attr->alt_port_num;
+		cmd->alt_timeout = attr->alt_timeout;
+
+		memcpy(cmd->alt_dest.dgid, attr->alt_ah_attr.grh.dgid.raw, 16);
+		cmd->alt_dest.flow_label = attr->alt_ah_attr.grh.flow_label;
+		cmd->alt_dest.dlid = attr->alt_ah_attr.dlid;
+		cmd->alt_dest.reserved = 0;
+		cmd->alt_dest.sgid_index = attr->alt_ah_attr.grh.sgid_index;
+		cmd->alt_dest.hop_limit = attr->alt_ah_attr.grh.hop_limit;
+		cmd->alt_dest.traffic_class =
+		    attr->alt_ah_attr.grh.traffic_class;
+		cmd->alt_dest.sl = attr->alt_ah_attr.sl;
+		cmd->alt_dest.src_path_bits = attr->alt_ah_attr.src_path_bits;
+		cmd->alt_dest.static_rate = attr->alt_ah_attr.static_rate;
+		cmd->alt_dest.is_global = attr->alt_ah_attr.is_global;
+		cmd->alt_dest.port_num = attr->alt_ah_attr.port_num;
+	} else {
+		cmd->alt_pkey_index = 0;
+		cmd->alt_port_num = 0;
+		cmd->alt_timeout = 0;
+		memset(&cmd->alt_dest, 0, sizeof(cmd->alt_dest));
+	}
+
+	if (attr_mask & IBV_QP_MIN_RNR_TIMER)
+		cmd->min_rnr_timer = attr->min_rnr_timer;
+	else
+		cmd->min_rnr_timer = 0;
+	if (attr_mask & IBV_QP_SQ_PSN)
+		cmd->sq_psn = attr->sq_psn;
+	else
+		cmd->sq_psn = 0;
+	if (attr_mask & IBV_QP_MAX_DEST_RD_ATOMIC)
+		cmd->max_dest_rd_atomic = attr->max_dest_rd_atomic;
+	else
+		cmd->max_dest_rd_atomic = 0;
+	if (attr_mask & IBV_QP_PATH_MIG_STATE)
+		cmd->path_mig_state = attr->path_mig_state;
+	else
+		cmd->path_mig_state = 0;
+	if (attr_mask & IBV_QP_DEST_QPN)
+		cmd->dest_qp_num = attr->dest_qp_num;
+	else
+		cmd->dest_qp_num = 0;
 
 	cmd->reserved[0] = cmd->reserved[1] = 0;

Valgrind reports:

==1196== Syscall param write(buf) points to uninitialised byte(s)
==1196==    at 0x506250D: ??? (syscall-template.S:84)
==1196==    by 0x527756F: ibv_cmd_modify_qp (cmd.c:1291)
==1196==    by 0x8008D74: mlx4_modify_qp (verbs.c:820)
==1196==    by 0x527E4F4: ibv_modify_qp@@IBVERBS_1.1 (verbs.c:561)
==1196==    by 0x4E3FAB3: ucma_modify_qp_err.isra.6 (cma.c:1115)
==1196==    by 0x4E41D56: rdma_get_cm_event.part.15 (cma.c:2180)
==1196==    by 0x402CF0: cm_thread (rping.c:576)
==1196==    by 0x5059709: start_thread (pthread_create.c:333)
==1196==    by 0x558A82C: clone (clone.S:109)
==1196==  Address 0x9847980 is on thread 2's stack
==1196==  in frame #2, created by mlx4_modify_qp (verbs.c:775)

This is because of code like this:

        struct ibv_qp_attr qp_attr;
        qp_attr.qp_state = IBV_QPS_ERR;
        return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, IBV_QP_STATE));

Always pass 0 into the kernel for attributes that are not requested
to be modified.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 libibverbs/cmd.c | 170 +++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 121 insertions(+), 49 deletions(-)

Shown with rping

Please double check my if's.. I followed the man page

I think there will be other cases where we do this wrong as well :\
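
The mask-gated copy described in the commit message can be tried in isolation. The following is a sketch under stated assumptions, not the libibverbs code: `struct gate_attr`, `struct gate_cmd`, the `GATE_QP_*` bits and both functions are hypothetical, and only three attributes are modelled. Two callers leave different stale bytes (simulated with 0xAA and 0x55 fills) in the fields they did not request; the check confirms that both produce byte-identical commands.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real mask bits and structs live in libibverbs. */
enum {
	GATE_QP_STATE = 1 << 0,
	GATE_QP_PORT  = 1 << 1,
	GATE_QP_QKEY  = 1 << 2,
};

struct gate_attr {
	uint32_t qp_state;
	uint32_t qkey;
	uint8_t  port_num;
};

/* Laid out without padding so every byte is written explicitly below. */
struct gate_cmd {
	uint32_t attr_mask;
	uint32_t qkey;
	uint8_t  qp_state;
	uint8_t  port_num;
	uint8_t  reserved[2];
};

/* Copy a field only when its mask bit is set; otherwise send zero. */
void gate_fill_cmd(struct gate_cmd *cmd, const struct gate_attr *attr,
		   uint32_t attr_mask)
{
	assert(cmd != NULL && attr != NULL);

	cmd->attr_mask = attr_mask;

	if (attr_mask & GATE_QP_STATE)
		cmd->qp_state = (uint8_t)attr->qp_state;
	else
		cmd->qp_state = 0;
	if (attr_mask & GATE_QP_PORT)
		cmd->port_num = attr->port_num;
	else
		cmd->port_num = 0;
	if (attr_mask & GATE_QP_QKEY)
		cmd->qkey = attr->qkey;
	else
		cmd->qkey = 0;

	cmd->reserved[0] = cmd->reserved[1] = 0;
}

/* Returns 0 if differing stale bytes in the callers' structs do not leak. */
int gate_check(void)
{
	struct gate_attr a, b;
	struct gate_cmd ca, cb;

	memset(&a, 0xAA, sizeof(a));
	memset(&b, 0x55, sizeof(b));
	a.qp_state = b.qp_state = 6;

	gate_fill_cmd(&ca, &a, GATE_QP_STATE);
	gate_fill_cmd(&cb, &b, GATE_QP_STATE);

	return memcmp(&ca, &cb, sizeof(ca)) == 0 ? 0 : 1;
}
```

The byte-wise comparison is only meaningful because the hypothetical command struct has no padding and every member, including the reserved bytes, is assigned.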