Message ID | 201107141058.29879.jackm@dev.mellanox.co.il (mailing list archive) |
---|---|
State | RFC, archived |
> 1. In procedure cma_ib_listen (and elsewhere), if the call to ib_create_cm_id
> fails, id_priv->cm_id.ib is left with the (non-zero) error value instead of
> NULL. However, if there is a failure later in the procedure, you set
> id_priv->cm_id.ib to NULL.
>
> Is there a reason for this difference in behavior?
> (code snippet from current code is below)
> int cma_ib_listen(struct rdma_id_
>
>	id_priv->cm_id.ib = ib_create_cm_id(id_priv->id.device,
>					    cma_req_handler,
>					    id_priv);
>	if (IS_ERR(id_priv->cm_id.ib))
>		return PTR_ERR(id_priv->cm_id.ib);

This looks like a bug.

>
>	....
>	if (ret) {
>		ib_destroy_cm_id(id_priv->cm_id.ib);
>		id_priv->cm_id.ib = NULL;
>	}
>
>
> 2. procedure cma_has_cm_dev looks as follows:
>
> static int cma_has_cm_dev(struct rdma_id_private *id_priv)
> {
>	return (id_priv->id.device && id_priv->cm_id.ib);
> }
>
> Shouldn't the line be:
>	return (id_priv->id.device && id_priv->cm_id.ib &&
>		!IS_ERR(id_priv->cm_id.ib));

Given the current code, adding IS_ERR makes sense, but see below.

Thinking about this, we shouldn't need to check id_priv->id.device. If we have id_priv->cm_id.ib, then the device pointer must be valid.

(fyi: cma_has_cm_dev() was added by commit 6c719f5c6c823901fac2d46b83db5a69ba7e9152. It replaced a state check with the above device check to handle device removal.)

> 3. There are several places where the value of id_priv->cm_id.ib
> is checked to be not NULL.
>
> Wouldn't it be better to call cma_has_cm_dev in these places
> (when cma_has_cm_dev has been fixed, as I suggested in 2 above).

Although it's a minor performance gain, I'm leaning towards keeping id_priv->cm_id.ib = NULL on failure, either by resetting it or using a local variable. cma_has_cm_dev() can then be replaced by checking id_priv->cm_id.ib for non-NULL, and checks for IS_ERR(id_priv->cm_id.ib) are removed.

> Please consider the patch below as a starting point (I did not touch the iwarp
> code).
> Please let me know (ASAP) what you think.
> (I still leave a window here where id_priv->cm_id.ib is ERR. Is this a
> problem?
> Would it be better to use a local variable instead of id_priv->cm_id.ib, and
> only assign to id_priv->cm_id.ib when all the error checks have passed?
> This would leave a window where a successful cm_id creation is not
> immediately assigned to the CMA object -- would this be a problem?).

Without adding more synchronization, we need to ensure that id_priv->cm_id.ib is set before a user can receive any callbacks. This restricts our use of a local variable to the create_cm_id calls only. See cma_connect_iw() for an example of using this approach.

Thanks,
- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday 14 July 2011 20:46, Hefty, Sean wrote:
> Although it's a minor performance gain, I'm leaning towards keeping
> id_priv->cm_id.ib = NULL on failure, either by resetting it or using a local
> variable. cma_has_cm_dev() can then be replaced by checking
> id_priv->cm_id.ib for non-NULL, and checks for IS_ERR(id_priv->cm_id.ib) are
> removed.
>
> > Please consider the patch below as a starting point (I did not touch the
> > iwarp code).
> > Please let me know (ASAP) what you think.
> > (I still leave a window here where id_priv->cm_id.ib is ERR. Is this a
> > problem?
> > Would it be better to use a local variable instead of id_priv->cm_id.ib, and
> > only assign to id_priv->cm_id.ib when all the error checks have passed?
> > This would leave a window where a successful cm_id creation is not
> > immediately assigned to the CMA object -- would this be a problem?).
>
> Without adding more synchronization, we need to ensure that
> id_priv->cm_id.ib is set before a user can receive any callbacks.
> This restricts our use of a local variable to the create_cm_id calls only.
> See cma_connect_iw() for an example of using this approach.
>

Agreed. That way we never have to test for IS_ERR.
I'll send a patch.

-Jack
====================================================================
--- cma.c	2011-07-13 09:54:09.000000000 +0300
+++ cma_fixed.c	2011-07-14 10:51:23.000000000 +0300
@@ -424,7 +424,8 @@ static int cma_disable_callback(struct r
 
 static int cma_has_cm_dev(struct rdma_id_private *id_priv)
 {
-	return (id_priv->id.device && id_priv->cm_id.ib);
+	return (id_priv->id.device && id_priv->cm_id.ib &&
+		!IS_ERR(id_priv->cm_id.ib));
 }
 
 struct rdma_cm_id *rdma_create_id(rdma_cm_event_handler event_handler,
@@ -658,7 +659,7 @@ int rdma_init_qp_attr(struct rdma_cm_id
 	id_priv = container_of(id, struct rdma_id_private, id);
 	switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
 	case RDMA_TRANSPORT_IB:
-		if (!id_priv->cm_id.ib || cma_is_ud_ps(id_priv->id.ps))
+		if (!cma_has_cm_dev(id_priv) || cma_is_ud_ps(id_priv->id.ps))
 			ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
 		else
 			ret = ib_cm_init_qp_attr(id_priv->cm_id.ib, qp_attr,
@@ -918,7 +919,7 @@ void rdma_destroy_id(struct rdma_cm_id *
 	if (id_priv->cma_dev) {
 		switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
 		case RDMA_TRANSPORT_IB:
-			if (id_priv->cm_id.ib && !IS_ERR(id_priv->cm_id.ib))
+			if (cma_has_cm_dev(id_priv))
 				ib_destroy_cm_id(id_priv->cm_id.ib);
 			break;
 		case RDMA_TRANSPORT_IWARP:
@@ -1471,8 +1472,10 @@ static int cma_ib_listen(struct rdma_id_
 
 	id_priv->cm_id.ib = ib_create_cm_id(id_priv->id.device, cma_req_handler,
 					    id_priv);
-	if (IS_ERR(id_priv->cm_id.ib))
-		return PTR_ERR(id_priv->cm_id.ib);
+	if (IS_ERR(id_priv->cm_id.ib)) {
+		ret = PTR_ERR(id_priv->cm_id.ib);
+		goto out;
+	}
 
 	addr = (struct sockaddr *) &id_priv->id.route.addr.src_addr;
 	svc_id = cma_get_service_id(id_priv->id.ps, addr);
@@ -1482,9 +1485,10 @@ static int cma_ib_listen(struct rdma_id_
 		cma_set_compare_data(id_priv->id.ps, addr, &compare_data);
 		ret = ib_cm_listen(id_priv->cm_id.ib, svc_id, 0, &compare_data);
 	}
-
+out:
 	if (ret) {
-		ib_destroy_cm_id(id_priv->cm_id.ib);
+		if (!IS_ERR(id_priv->cm_id.ib))
+			ib_destroy_cm_id(id_priv->cm_id.ib);
 		id_priv->cm_id.ib = NULL;
 	}
 
@@ -2454,11 +2458,12 @@ static int cma_resolve_ib_udp(struct rdm
 	req.max_cm_retries = CMA_MAX_CM_RETRIES;
 
 	ret = ib_send_cm_sidr_req(id_priv->cm_id.ib, &req);
+out:
 	if (ret) {
-		ib_destroy_cm_id(id_priv->cm_id.ib);
+		if (!IS_ERR(id_priv->cm_id.ib))
+			ib_destroy_cm_id(id_priv->cm_id.ib);
 		id_priv->cm_id.ib = NULL;
 	}
-out:
 	kfree(req.private_data);
 	return ret;
 }
@@ -2516,8 +2521,9 @@ static int cma_connect_ib(struct rdma_id
 
 	ret = ib_send_cm_req(id_priv->cm_id.ib, &req);
 out:
-	if (ret && !IS_ERR(id_priv->cm_id.ib)) {
-		ib_destroy_cm_id(id_priv->cm_id.ib);
+	if (ret) {
+		if (!IS_ERR(id_priv->cm_id.ib))
+			ib_destroy_cm_id(id_priv->cm_id.ib);
 		id_priv->cm_id.ib = NULL;
 	}
 
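
For comparison with the RFC diff, the approach the thread converges on (create the cm_id into a local variable, and store it in the shared field only once creation has succeeded) might look roughly like the userspace sketch below. This is not the follow-up patch and not kernel code. The ERR_PTR/PTR_ERR/IS_ERR helpers are simplified stand-ins for the kernel macros, and every struct and function name here (struct cm_id, struct id_private, mock_create_cm_id, mock_cm_listen, mock_destroy_cm_id, sketch_listen) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error values occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, heavily reduced versions of the real structures. */
struct cm_id {
	int listening;
};

struct id_private {
	struct cm_id *cm_id;	/* invariant: NULL or a valid handle */
};

/* Mock of a create call that returns a handle or an ERR_PTR value. */
static struct cm_id *mock_create_cm_id(int fail)
{
	struct cm_id *id;

	if (fail)
		return ERR_PTR(-ENOMEM);
	id = calloc(1, sizeof(*id));
	return id ? id : ERR_PTR(-ENOMEM);
}

static void mock_destroy_cm_id(struct cm_id *id)
{
	assert(id && !IS_ERR(id));	/* callers never pass an error value */
	free(id);
}

/* Mock of a listen call that can fail after creation has succeeded. */
static int mock_cm_listen(struct cm_id *id, int fail)
{
	if (fail)
		return -EBUSY;
	id->listening = 1;
	return 0;
}

static int sketch_listen(struct id_private *priv, int fail_create,
			 int fail_listen)
{
	struct cm_id *id;
	int ret;

	/* The local variable is used for the create call only. */
	id = mock_create_cm_id(fail_create);
	if (IS_ERR(id))
		return (int)PTR_ERR(id);	/* priv->cm_id is still NULL */

	/* Publish the handle before any step that could trigger callbacks. */
	priv->cm_id = id;

	ret = mock_cm_listen(priv->cm_id, fail_listen);
	if (ret) {
		mock_destroy_cm_id(priv->cm_id);
		priv->cm_id = NULL;
	}
	return ret;
}
```

With this shape the shared field never holds an error value, so callers in the sketch can test it against NULL alone. The handle is still stored before the listen step, which mirrors the constraint stated in the thread that the field must be set before a user can receive callbacks.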