diff mbox

[v3,10/28] IB/Verbs: Reform cm related part in IB-core cma

Message ID 552BB60F.60109@profitbricks.com (mailing list archive)
State Rejected
Headers show

Commit Message

Michael Wang April 13, 2015, 12:26 p.m. UTC
Use raw management helpers to reform cm related part in IB-core cma.

Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Michael Wang <yun.wang@profitbricks.com>
---
 drivers/infiniband/core/cma.c | 79 ++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 54 deletions(-)

Comments

Hefty, Sean April 13, 2015, 7:25 p.m. UTC | #1
> @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)

>  	mutex_unlock(&id_priv->handler_mutex);

> 

>  	if (id_priv->cma_dev) {

> -		switch (rdma_node_get_transport(id_priv->id.device-

> >node_type)) {

> -		case RDMA_TRANSPORT_IB:

> +		if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))


A listen id can be associated with a device without being associated with a port (see the listen_any_list).  Some other check is needed to handle this case.  I guess the code could check the first port on the device (replace port_num with hardcoded value 1).  Then we wouldn't be any more broken than the code already is.  (The 'break' is conceptual, not practical.)

This appears to be highlighting an architectural flaw in the iboe integration.

> {

>  			if (id_priv->cm_id.ib)

>  				ib_destroy_cm_id(id_priv->cm_id.ib);

> -			break;

> -		case RDMA_TRANSPORT_IWARP:

> +		} else if (rdma_tech_iwarp(id_priv->id.device,

> +					   id_priv->id.port_num)) {

>  			if (id_priv->cm_id.iw)

>  				iw_destroy_cm_id(id_priv->cm_id.iw);

> -			break;

> -		default:

> -			break;

>  		}

>  		cma_leave_mc_groups(id_priv);

>  		cma_release_dev(id_priv);


 - Sean
Jason Gunthorpe April 13, 2015, 7:50 p.m. UTC | #2
On Mon, Apr 13, 2015 at 07:25:48PM +0000, Hefty, Sean wrote:
> > @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> >  	mutex_unlock(&id_priv->handler_mutex);
> > 
> >  	if (id_priv->cma_dev) {
> > -		switch (rdma_node_get_transport(id_priv->id.device-
> > >node_type)) {
> > -		case RDMA_TRANSPORT_IB:
> > +		if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
> 
> A listen id can be associated with a device without being associated
> with a port (see the listen_any_list).  Some other check is needed
> to handle this case.  I guess the code could check the first port on
> the device (replace port_num with hardcoded value 1).  Then we
> wouldn't be any more broken than the code already is.  (The 'break'
> is conceptual, not practical.)

Hum. So, devices on a port must have some compatibility when it comes
to these invariants. It looks like all ports must have the same
iwarpyness, for multiple reasons.

Less clear is how rocee vs ib work within a device... Can you APM
between those two kinds of ports?

All these switches are so ugly :| Function pointers setup in
iw_/ib_create_cm_id would be a lot clearer and safer.

> This appears to be highlighting an architectural flaw in the iboe integration.

You mean iwarp?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hefty, Sean April 13, 2015, 8:31 p.m. UTC | #3
> On Mon, Apr 13, 2015 at 07:25:48PM +0000, Hefty, Sean wrote:
> > > @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> > >  	mutex_unlock(&id_priv->handler_mutex);
> > >
> > >  	if (id_priv->cma_dev) {
> > > -		switch (rdma_node_get_transport(id_priv->id.device-
> > > >node_type)) {
> > > -		case RDMA_TRANSPORT_IB:
> > > +		if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
> >
> > A listen id can be associated with a device without being associated
> > with a port (see the listen_any_list).  Some other check is needed
> > to handle this case.  I guess the code could check the first port on
> > the device (replace port_num with hardcoded value 1).  Then we
> > wouldn't be any more broken than the code already is.  (The 'break'
> > is conceptual, not practical.)
> 
> Hum. So, devices on a port must have some compatibility when it comes
> to these invariants. It looks like all ports must have the same
> iwarpyness, for multiple reasons.
> 
> Less clear is how rocee vs ib work within a device... Can you APM
> between those two kinds of ports?

No idea

> All these switches are so ugly :| Function pointers setup in
> iw_/ib_create_cm_id would be a lot clearer and safer.

I noticed this too.  The if checks throughout the cma are becoming unmaintainable.  It may be cleaner if all devices adopted using the cm device function pointers.

> > This appears to be highlighting an architectural flaw in the iboe
> integration.
> 
> You mean iwarp?

I meant iboe.  Wildcard listens map to multiple listens, one per device.  The assumption being that all ports on the device are the same.  IBoE changed that assumption.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang April 14, 2015, 8:35 a.m. UTC | #4
On 04/13/2015 09:25 PM, Hefty, Sean wrote:
>> @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
>>  	mutex_unlock(&id_priv->handler_mutex);
>>
>>  	if (id_priv->cma_dev) {
>> -		switch (rdma_node_get_transport(id_priv->id.device-
>>> node_type)) {
>> -		case RDMA_TRANSPORT_IB:
>> +		if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
> 
> A listen id can be associated with a device without being associated with a port (see the listen_any_list). 
Some other check is needed to handle this case.  I guess the code could check the first port on the device
(replace port_num with hardcoded value 1).  Then we wouldn't be any more broken than the code already is.
(The 'break' is conceptual, not practical.)

Agree, seems like this is very similar to the case of cma_listen_on_dev() which
do not associated with any particular port in #24.

If the port 1 is enough to present the whole device's cm capability, maybe we can
get rid of cap_ib_cm_dev() too?

And maybe cap_ib_cm(device, device->node_type == RDMA_NODE_IB_SWITCH ? 0:1) would
be safer?

Regards,
Michael Wang


> 
> This appears to be highlighting an architectural flaw in the iboe integration.
> 
>> {
>>  			if (id_priv->cm_id.ib)
>>  				ib_destroy_cm_id(id_priv->cm_id.ib);
>> -			break;
>> -		case RDMA_TRANSPORT_IWARP:
>> +		} else if (rdma_tech_iwarp(id_priv->id.device,
>> +					   id_priv->id.port_num)) {
>>  			if (id_priv->cm_id.iw)
>>  				iw_destroy_cm_id(id_priv->cm_id.iw);
>> -			break;
>> -		default:
>> -			break;
>>  		}
>>  		cma_leave_mc_groups(id_priv);
>>  		cma_release_dev(id_priv);
> 
>  - Sean
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ira Weiny April 14, 2015, 3:50 p.m. UTC | #5
On Tue, Apr 14, 2015 at 10:35:34AM +0200, Michael Wang wrote:
> 
> 
> On 04/13/2015 09:25 PM, Hefty, Sean wrote:
> >> @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> >>  	mutex_unlock(&id_priv->handler_mutex);
> >>
> >>  	if (id_priv->cma_dev) {
> >> -		switch (rdma_node_get_transport(id_priv->id.device-
> >>> node_type)) {
> >> -		case RDMA_TRANSPORT_IB:
> >> +		if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
> > 
> > A listen id can be associated with a device without being associated with a port (see the listen_any_list). 
> Some other check is needed to handle this case.  I guess the code could check the first port on the device
> (replace port_num with hardcoded value 1).  Then we wouldn't be any more broken than the code already is.
> (The 'break' is conceptual, not practical.)
> 
> Agree, seems like this is very similar to the case of cma_listen_on_dev() which
> do not associated with any particular port in #24.
> 
> If the port 1 is enough to present the whole device's cm capability, maybe we can
> get rid of cap_ib_cm_dev() too?
> 
> And maybe cap_ib_cm(device, device->node_type == RDMA_NODE_IB_SWITCH ? 0:1) would
> be safer?

I don't see support for switch port 0 in cm_add_one() now.  Are switches supposed
to be supported?

Ira

> 
> Regards,
> Michael Wang
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang April 14, 2015, 3:58 p.m. UTC | #6
On 04/14/2015 05:50 PM, ira.weiny wrote:
> On Tue, Apr 14, 2015 at 10:35:34AM +0200, Michael Wang wrote:
>>
>>
>> On 04/13/2015 09:25 PM, Hefty, Sean wrote:
>>>> @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
>>>>  	mutex_unlock(&id_priv->handler_mutex);
>>>>
>>>>  	if (id_priv->cma_dev) {
>>>> -		switch (rdma_node_get_transport(id_priv->id.device-
>>>>> node_type)) {
>>>> -		case RDMA_TRANSPORT_IB:
>>>> +		if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
>>>
>>> A listen id can be associated with a device without being associated with a port (see the listen_any_list). 
>> Some other check is needed to handle this case.  I guess the code could check the first port on the device
>> (replace port_num with hardcoded value 1).  Then we wouldn't be any more broken than the code already is.
>> (The 'break' is conceptual, not practical.)
>>
>> Agree, seems like this is very similar to the case of cma_listen_on_dev() which
>> do not associated with any particular port in #24.
>>
>> If the port 1 is enough to present the whole device's cm capability, maybe we can
>> get rid of cap_ib_cm_dev() too?
>>
>> And maybe cap_ib_cm(device, device->node_type == RDMA_NODE_IB_SWITCH ? 0:1) would
>> be safer?
> 
> I don't see support for switch port 0 in cm_add_one() now.  Are switches supposed
> to be supported?

Just concern about the validation of port... is it possible that the device we check
in here don't have port 1? (forgive me if the question is too silly :-P)

Regards,
Michael Wang

> 
> Ira
> 
>>
>> Regards,
>> Michael Wang
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hal Rosenstock April 15, 2015, 6:36 p.m. UTC | #7
On 4/13/2015 3:50 PM, Jason Gunthorpe wrote:
> Less clear is how rocee vs ib work within a device... Can you APM
> between those two kinds of ports?

The specs allow this to work but AFAIK it's not implemented.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..8ba5553 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -735,8 +735,7 @@  int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
 	int ret = 0;
 
 	id_priv = container_of(id, struct rdma_id_private, id);
-	switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
-	case RDMA_TRANSPORT_IB:
+	if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num)) {
 		if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
 			ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
 		else
@@ -745,19 +744,16 @@  int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
 
 		if (qp_attr->qp_state == IB_QPS_RTR)
 			qp_attr->rq_psn = id_priv->seq_num;
-		break;
-	case RDMA_TRANSPORT_IWARP:
+	} else if (rdma_tech_iwarp(id_priv->id.device,
+				   id_priv->id.port_num)) {
 		if (!id_priv->cm_id.iw) {
 			qp_attr->qp_access_flags = 0;
 			*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
 		} else
 			ret = iw_cm_init_qp_attr(id_priv->cm_id.iw, qp_attr,
 						 qp_attr_mask);
-		break;
-	default:
+	} else
 		ret = -ENOSYS;
-		break;
-	}
 
 	return ret;
 }
@@ -1037,17 +1033,13 @@  void rdma_destroy_id(struct rdma_cm_id *id)
 	mutex_unlock(&id_priv->handler_mutex);
 
 	if (id_priv->cma_dev) {
-		switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
-		case RDMA_TRANSPORT_IB:
+		if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num)) {
 			if (id_priv->cm_id.ib)
 				ib_destroy_cm_id(id_priv->cm_id.ib);
-			break;
-		case RDMA_TRANSPORT_IWARP:
+		} else if (rdma_tech_iwarp(id_priv->id.device,
+					   id_priv->id.port_num)) {
 			if (id_priv->cm_id.iw)
 				iw_destroy_cm_id(id_priv->cm_id.iw);
-			break;
-		default:
-			break;
 		}
 		cma_leave_mc_groups(id_priv);
 		cma_release_dev(id_priv);
@@ -2060,7 +2052,7 @@  port_found:
 		goto out;
 
 	id_priv->id.route.addr.dev_addr.dev_type =
-		(rdma_port_get_link_layer(cma_dev->device, p) == IB_LINK_LAYER_INFINIBAND) ?
+		(rdma_tech_ib(cma_dev->device, p)) ?
 		ARPHRD_INFINIBAND : ARPHRD_ETHER;
 
 	rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
@@ -2537,18 +2529,15 @@  int rdma_listen(struct rdma_cm_id *id, int backlog)
 
 	id_priv->backlog = backlog;
 	if (id->device) {
-		switch (rdma_node_get_transport(id->device->node_type)) {
-		case RDMA_TRANSPORT_IB:
+		if (rdma_ib_or_iboe(id->device, id->port_num)) {
 			ret = cma_ib_listen(id_priv);
 			if (ret)
 				goto err;
-			break;
-		case RDMA_TRANSPORT_IWARP:
+		} else if (rdma_tech_iwarp(id->device, id->port_num)) {
 			ret = cma_iw_listen(id_priv, backlog);
 			if (ret)
 				goto err;
-			break;
-		default:
+		} else {
 			ret = -ENOSYS;
 			goto err;
 		}
@@ -2884,20 +2873,15 @@  int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
 		id_priv->srq = conn_param->srq;
 	}
 
-	switch (rdma_node_get_transport(id->device->node_type)) {
-	case RDMA_TRANSPORT_IB:
+	if (rdma_ib_or_iboe(id->device, id->port_num)) {
 		if (id->qp_type == IB_QPT_UD)
 			ret = cma_resolve_ib_udp(id_priv, conn_param);
 		else
 			ret = cma_connect_ib(id_priv, conn_param);
-		break;
-	case RDMA_TRANSPORT_IWARP:
+	} else if (rdma_tech_iwarp(id->device, id->port_num))
 		ret = cma_connect_iw(id_priv, conn_param);
-		break;
-	default:
+	else
 		ret = -ENOSYS;
-		break;
-	}
 	if (ret)
 		goto err;
 
@@ -3000,8 +2984,7 @@  int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
 		id_priv->srq = conn_param->srq;
 	}
 
-	switch (rdma_node_get_transport(id->device->node_type)) {
-	case RDMA_TRANSPORT_IB:
+	if (rdma_ib_or_iboe(id->device, id->port_num)) {
 		if (id->qp_type == IB_QPT_UD) {
 			if (conn_param)
 				ret = cma_send_sidr_rep(id_priv, IB_SIDR_SUCCESS,
@@ -3017,14 +3000,10 @@  int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
 			else
 				ret = cma_rep_recv(id_priv);
 		}
-		break;
-	case RDMA_TRANSPORT_IWARP:
+	} else if (rdma_tech_iwarp(id->device, id->port_num))
 		ret = cma_accept_iw(id_priv, conn_param);
-		break;
-	default:
+	else
 		ret = -ENOSYS;
-		break;
-	}
 
 	if (ret)
 		goto reject;
@@ -3068,8 +3047,7 @@  int rdma_reject(struct rdma_cm_id *id, const void *private_data,
 	if (!id_priv->cm_id.ib)
 		return -EINVAL;
 
-	switch (rdma_node_get_transport(id->device->node_type)) {
-	case RDMA_TRANSPORT_IB:
+	if (rdma_ib_or_iboe(id->device, id->port_num)) {
 		if (id->qp_type == IB_QPT_UD)
 			ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
 						private_data, private_data_len);
@@ -3077,15 +3055,12 @@  int rdma_reject(struct rdma_cm_id *id, const void *private_data,
 			ret = ib_send_cm_rej(id_priv->cm_id.ib,
 					     IB_CM_REJ_CONSUMER_DEFINED, NULL,
 					     0, private_data, private_data_len);
-		break;
-	case RDMA_TRANSPORT_IWARP:
+	} else if (rdma_tech_iwarp(id->device, id->port_num)) {
 		ret = iw_cm_reject(id_priv->cm_id.iw,
 				   private_data, private_data_len);
-		break;
-	default:
+	} else
 		ret = -ENOSYS;
-		break;
-	}
+
 	return ret;
 }
 EXPORT_SYMBOL(rdma_reject);
@@ -3099,22 +3074,18 @@  int rdma_disconnect(struct rdma_cm_id *id)
 	if (!id_priv->cm_id.ib)
 		return -EINVAL;
 
-	switch (rdma_node_get_transport(id->device->node_type)) {
-	case RDMA_TRANSPORT_IB:
+	if (rdma_ib_or_iboe(id->device, id->port_num)) {
 		ret = cma_modify_qp_err(id_priv);
 		if (ret)
 			goto out;
 		/* Initiate or respond to a disconnect. */
 		if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
 			ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
-		break;
-	case RDMA_TRANSPORT_IWARP:
+	} else if (rdma_tech_iwarp(id->device, id->port_num)) {
 		ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
-		break;
-	default:
+	} else
 		ret = -EINVAL;
-		break;
-	}
+
 out:
 	return ret;
 }