
Questions regarding CMA

Message ID 201107141058.29879.jackm@dev.mellanox.co.il (mailing list archive)
State RFC, archived

Commit Message

jackm July 14, 2011, 7:58 a.m. UTC
I am currently reviewing/cleaning up our CMA LAP patches for submission.

I have several questions regarding the file cma.c, which I would like you to clarify for me.

1. In procedure cma_ib_listen (and elsewhere), if the call to ib_create_cm_id fails,
   id_priv->cm_id.ib is left with the (non-zero) error value instead of NULL.
   However, if there is a failure later in the procedure, you set id_priv->cm_id.ib to NULL.

   Is there a reason for this difference in behavior? 
   (code snippet from current code is below)
	int cma_ib_listen(struct rdma_id_

		id_priv->cm_id.ib = ib_create_cm_id(id_priv->id.device, cma_req_handler,
						    id_priv);
		if (IS_ERR(id_priv->cm_id.ib))
			return PTR_ERR(id_priv->cm_id.ib);

		....
		if (ret) {
			ib_destroy_cm_id(id_priv->cm_id.ib);
			id_priv->cm_id.ib = NULL;
		}

   
2. procedure cma_has_cm_dev looks as follows:

	static int cma_has_cm_dev(struct rdma_id_private *id_priv)
	{
		return (id_priv->id.device && id_priv->cm_id.ib);
	}

   Shouldn't the line be:
		return (id_priv->id.device && id_priv->cm_id.ib &&
			!IS_ERR(id_priv->cm_id.ib));

3. There are several places where the value of id_priv->cm_id.ib
   is checked to be not NULL.

   Wouldn't it be better to call cma_has_cm_dev in these places
   (once cma_has_cm_dev has been fixed, as I suggested in 2 above)?
		
Please consider the patch below as a starting point (I did not touch the iwarp code).
Please let me know (ASAP) what you think.
(I still leave a window here where id_priv->cm_id.ib is ERR. Is this a problem?
 Would it be better to use a local variable instead of id_priv->cm_id.ib, and
 only assign to id_priv->cm_id.ib when all the error checks have passed?
 This would leave a window where a successful cm_id creation is not immediately assigned
 to the CMA object -- would this be a problem?).

Thanks!

-Jack
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Hefty, Sean July 14, 2011, 5:46 p.m. UTC | #1
> 1. In procedure cma_ib_listen (and elsewhere), if the call to ib_create_cm_id
> fails,
>    id_priv->cm_id.ib is left with the (non-zero) error value instead of NULL.
>    However, if there is a failure later in the procedure, you set id_priv-
> >cm_id.ib to NULL.
> 
>    Is there a reason for this difference in behavior?
>    (code snippet from current code is below)
> 	int cma_ib_listen(struct rdma_id_
> 
>        		id_priv->cm_id.ib = ib_create_cm_id(id_priv->id.device,
> cma_req_handler,
>                                             	    id_priv);
> 	       	if (IS_ERR(id_priv->cm_id.ib))
>                		return PTR_ERR(id_priv->cm_id.ib);

This looks like a bug.
 
> 
> 		....
> 	        if (ret) {
>                 	ib_destroy_cm_id(id_priv->cm_id.ib);
>                 	id_priv->cm_id.ib = NULL;
>         	}
> 
> 
> 2. procedure cma_has_cm_dev looks as follows:
> 
> 	static int cma_has_cm_dev(struct rdma_id_private *id_priv)
> 	{
> 		return (id_priv->id.device && id_priv->cm_id.ib);
> 	}
> 
>    Shouldn't the line be:
> 		return (id_priv->id.device && id_priv->cm_id.ib &&
> 			!IS_ERR(id_priv->cm_id.ib));

Given the current code, adding IS_ERR makes sense, but see below.  Thinking about this, we shouldn't need to check id_priv->id.device.  If we have id_priv->cm_id.ib, then the device pointer must be valid.

(fyi: cma_has_cm_dev() was added by commit 6c719f5c6c823901fac2d46b83db5a69ba7e9152.  It replaced a state check with the above device check to handle device removal.)
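
To illustrate why the plain non-NULL test is not enough while an error pointer can be left in id_priv->cm_id.ib, here is a minimal userspace sketch. The ERR_PTR/PTR_ERR/IS_ERR helpers are simplified stand-ins for the kernel's <linux/err.h> versions, and struct demo_id_priv and the two has_cm_dev_* functions are hypothetical names used only for this illustration, not code from cma.c.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, reduced stand-in for struct rdma_id_private. */
struct demo_id_priv {
	void *device;	/* plays the role of id_priv->id.device */
	void *cm_id;	/* plays the role of id_priv->cm_id.ib */
};

/* The existing style of check: any non-NULL value passes, including ERR_PTR. */
static int has_cm_dev_null_only(const struct demo_id_priv *p)
{
	return p->device && p->cm_id;
}

/* The check with the proposed IS_ERR() test added. */
static int has_cm_dev_checked(const struct demo_id_priv *p)
{
	return p->device && p->cm_id && !IS_ERR(p->cm_id);
}
```

With cm_id set to ERR_PTR(-ENOMEM), has_cm_dev_null_only() reports a usable cm_id while has_cm_dev_checked() does not. If the field is always reset to NULL on failure, the two checks agree and the IS_ERR() test becomes unnecessary.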

> 3. There are several places where the value of id_priv->cm_id.ib
>    is checked to be not NULL.
> 
>    Wouldn't it be better to call cma_has_cm_dev in these places
>    (when cma_has_cm_dev has been fixed, as I suggested in 2 above).

Although it's a minor performance gain, I'm leaning towards keeping id_priv->cm_id.ib = NULL on failure, either by resetting it or using a local variable.  cma_has_cm_dev() can then be replaced by checking id_priv->cm_id.ib for non-NULL, and checks for IS_ERR(id_priv->cm_id.ib) are removed.

> Please consider the patch below as a starting point (I did not touch the iwarp
> code).
> Please let me know (ASAP) what you think.
> (I still leave a window here where id_priv->cm_id.ib is ERR. Is this a
> problem?
>  Would it be better to use a local variable instead of id_priv->cm_id.ib, and
>  only assign to id_priv->cm_id.ib when all the error checks have passed?
>  This would leave a window where a successful cm_id creation is not
> immediately assigned
>  to the CMA object -- would this be a problem?).

Without adding more synchronization, we need to ensure that id_priv->cm_id.ib is set before a user can receive any callbacks.  This restricts our use of a local variable to the create_cm_id calls only.  See cma_connect_iw() for an example of using this approach.
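
The local-variable approach described above could look roughly like the following userspace sketch. Everything prefixed with demo_ is a hypothetical stand-in (demo_create_cm_id for ib_create_cm_id, demo_cm_listen for ib_cm_listen, demo_destroy_cm_id for ib_destroy_cm_id), and the ERR_PTR helpers are simplified copies of the kernel's; this is not the actual cma_connect_iw() code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-ins for the cm_id and for struct rdma_id_private. */
struct demo_cm_id {
	int listening;
};

struct demo_id_priv {
	struct demo_cm_id *cm_id;	/* NULL or a valid pointer, never ERR_PTR */
};

/* Stand-in for ib_create_cm_id(): returns ERR_PTR() on failure. */
static struct demo_cm_id *demo_create_cm_id(int fail)
{
	static struct demo_cm_id the_id;

	if (fail)
		return ERR_PTR(-ENOMEM);
	the_id.listening = 0;
	return &the_id;
}

/* Stand-in for ib_cm_listen(): returns 0 or a negative errno. */
static int demo_cm_listen(struct demo_cm_id *id, int fail)
{
	if (fail)
		return -EBUSY;
	id->listening = 1;
	return 0;
}

/* Stand-in for ib_destroy_cm_id(). */
static void demo_destroy_cm_id(struct demo_cm_id *id)
{
	id->listening = 0;
}

static int demo_listen(struct demo_id_priv *priv, int fail_create,
		       int fail_listen)
{
	struct demo_cm_id *id;
	int ret;

	/* Only the create call uses the local variable. */
	id = demo_create_cm_id(fail_create);
	if (IS_ERR(id))
		return PTR_ERR(id);	/* priv->cm_id was never touched */

	/* Publish before any call that could trigger a callback. */
	priv->cm_id = id;

	ret = demo_cm_listen(priv->cm_id, fail_listen);
	if (ret) {
		demo_destroy_cm_id(priv->cm_id);
		priv->cm_id = NULL;
	}
	return ret;
}
```

In this shape the shared field only ever holds NULL or a valid pointer, so later checks can test for non-NULL alone.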

Thanks,
- Sean
jackm July 17, 2011, 7:08 a.m. UTC | #2
On Thursday 14 July 2011 20:46, Hefty, Sean wrote:
> Although it's a minor performance gain, I'm leaning towards keeping id_priv->cm_id.ib = NULL on failure, either by resetting it or using a local variable.  cma_has_cm_dev() can then be replaced by checking id_priv->cm_id.ib for non-NULL, and checks for IS_ERR(id_priv->cm_id.ib) are removed.
> 
> > Please consider the patch below as a starting point (I did not touch the iwarp
> > code).
> > Please let me know (ASAP) what you think.
> > (I still leave a window here where id_priv->cm_id.ib is ERR. Is this a
> > problem?
> >  Would it be better to use a local variable instead of id_priv->cm_id.ib, and
> >  only assign to id_priv->cm_id.ib when all the error checks have passed?
> >  This would leave a window where a successful cm_id creation is not
> > immediately assigned
> >  to the CMA object -- would this be a problem?).
> 
> Without adding more synchronization, we need to ensure that id_priv->cm_id.ib is set before a user can receive any callbacks.
> This restricts our use of a local variable to the create_cm_id calls only.
> See cma_connect_iw() for an example of using this approach.  
> 
Agreed.  That way we never have to test for IS_ERR. I'll send a patch.

-Jack

Patch

====================================================================
--- cma.c	2011-07-13 09:54:09.000000000 +0300
+++ cma_fixed.c	2011-07-14 10:51:23.000000000 +0300
@@ -424,7 +424,8 @@  static int cma_disable_callback(struct r
 
 static int cma_has_cm_dev(struct rdma_id_private *id_priv)
 {
-	return (id_priv->id.device && id_priv->cm_id.ib);
+	return (id_priv->id.device && id_priv->cm_id.ib &&
+		!IS_ERR(id_priv->cm_id.ib));
 }
 
 struct rdma_cm_id *rdma_create_id(rdma_cm_event_handler event_handler,
@@ -658,7 +659,7 @@  int rdma_init_qp_attr(struct rdma_cm_id 
 	id_priv = container_of(id, struct rdma_id_private, id);
 	switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
 	case RDMA_TRANSPORT_IB:
-		if (!id_priv->cm_id.ib || cma_is_ud_ps(id_priv->id.ps))
+		if (!cma_has_cm_dev(id_priv) || cma_is_ud_ps(id_priv->id.ps))
 			ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
 		else
 			ret = ib_cm_init_qp_attr(id_priv->cm_id.ib, qp_attr,
@@ -918,7 +919,7 @@  void rdma_destroy_id(struct rdma_cm_id *
 	if (id_priv->cma_dev) {
 		switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
 		case RDMA_TRANSPORT_IB:
-			if (id_priv->cm_id.ib && !IS_ERR(id_priv->cm_id.ib))
+			if (cma_has_cm_dev(id_priv))
 				ib_destroy_cm_id(id_priv->cm_id.ib);
 			break;
 		case RDMA_TRANSPORT_IWARP:
@@ -1471,8 +1472,10 @@  static int cma_ib_listen(struct rdma_id_
 
 	id_priv->cm_id.ib = ib_create_cm_id(id_priv->id.device, cma_req_handler,
 					    id_priv);
-	if (IS_ERR(id_priv->cm_id.ib))
-		return PTR_ERR(id_priv->cm_id.ib);
+	if (IS_ERR(id_priv->cm_id.ib)) {
+		ret = PTR_ERR(id_priv->cm_id.ib);
+		goto out;
+	}
 
 	addr = (struct sockaddr *) &id_priv->id.route.addr.src_addr;
 	svc_id = cma_get_service_id(id_priv->id.ps, addr);
@@ -1482,9 +1485,10 @@  static int cma_ib_listen(struct rdma_id_
 		cma_set_compare_data(id_priv->id.ps, addr, &compare_data);
 		ret = ib_cm_listen(id_priv->cm_id.ib, svc_id, 0, &compare_data);
 	}
-
+out:
 	if (ret) {
-		ib_destroy_cm_id(id_priv->cm_id.ib);
+		if (!IS_ERR(id_priv->cm_id.ib))
+			ib_destroy_cm_id(id_priv->cm_id.ib);
 		id_priv->cm_id.ib = NULL;
 	}
 
@@ -2454,11 +2458,12 @@  static int cma_resolve_ib_udp(struct rdm
 	req.max_cm_retries = CMA_MAX_CM_RETRIES;
 
 	ret = ib_send_cm_sidr_req(id_priv->cm_id.ib, &req);
+out:
 	if (ret) {
-		ib_destroy_cm_id(id_priv->cm_id.ib);
+		if (!IS_ERR(id_priv->cm_id.ib))
+			ib_destroy_cm_id(id_priv->cm_id.ib);
 		id_priv->cm_id.ib = NULL;
 	}
-out:
 	kfree(req.private_data);
 	return ret;
 }
@@ -2516,8 +2521,9 @@  static int cma_connect_ib(struct rdma_id
 
 	ret = ib_send_cm_req(id_priv->cm_id.ib, &req);
 out:
-	if (ret && !IS_ERR(id_priv->cm_id.ib)) {
-		ib_destroy_cm_id(id_priv->cm_id.ib);
+	if (ret) {
+		if (!IS_ERR(id_priv->cm_id.ib))
+			ib_destroy_cm_id(id_priv->cm_id.ib);
 		id_priv->cm_id.ib = NULL;
 	}