From patchwork Fri Oct 22 20:59:07 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arlin Davis
X-Patchwork-Id: 277981
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id o9MKxZk0024336
	for ; Fri, 22 Oct 2010 20:59:35 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758416Ab0JVU7M (ORCPT );
	Fri, 22 Oct 2010 16:59:12 -0400
Received: from mga11.intel.com ([192.55.52.93]:62238 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1759221Ab0JVU7K convert rfc822-to-8bit (ORCPT );
	Fri, 22 Oct 2010 16:59:10 -0400
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
	by fmsmga102.fm.intel.com with ESMTP; 22 Oct 2010 13:59:09 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.58,225,1286175600"; d="scan'208";a="619441339"
Received: from orsmsx604.amr.corp.intel.com ([10.22.226.87])
	by fmsmga002.fm.intel.com with ESMTP; 22 Oct 2010 13:59:09 -0700
Received: from orsmsx601.amr.corp.intel.com (10.22.226.213) by
	orsmsx604.amr.corp.intel.com (10.22.226.87) with Microsoft SMTP Server (TLS)
	id 8.2.254.0; Fri, 22 Oct 2010 13:59:08 -0700
Received: from orsmsx506.amr.corp.intel.com ([10.22.226.44]) by
	orsmsx601.amr.corp.intel.com ([10.22.226.213]) with mapi;
	Fri, 22 Oct 2010 13:59:07 -0700
From: "Davis, Arlin R"
To: linux-rdma , "ofw@lists.openfabrics.org"
Date: Fri, 22 Oct 2010 13:59:07 -0700
Subject: [PATCH] DAPL v2.0: scm, ucm: MPI spawn test on oversubcribed server
	taking excessive time to complete
Thread-Topic: [PATCH] DAPL v2.0: scm, ucm: MPI spawn test on oversubcribed
	server taking excessive time to complete
Thread-Index: ActyK/eXoVZocnkTSi6O4wAKBa6Dqg==
Message-ID:
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
acceptlanguage: en-US
MIME-Version: 1.0
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-rdma@vger.kernel.org
X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
	milter-greylist-4.2.3 (demeter1.kernel.org [140.211.167.41]);
	Fri, 22 Oct 2010 20:59:36 +0000 (UTC)

diff --git a/dapl/openib_scm/cm.c b/dapl/openib_scm/cm.c
index 56d4c73..f82d0ff 100644
--- a/dapl/openib_scm/cm.c
+++ b/dapl/openib_scm/cm.c
@@ -463,10 +463,8 @@ DAT_RETURN dapli_socket_disconnect(dp_ib_cm_handle_t cm_ptr)
 		return DAT_SUCCESS;
 	}
 	cm_ptr->state = DCM_DISCONNECTED;
-	dapl_os_unlock(&cm_ptr->lock);
-
-	/* send disc date, close socket, schedule destroy */
 	send(cm_ptr->socket, (char *)&disc_data, sizeof(disc_data), 0);
+	dapl_os_unlock(&cm_ptr->lock);
 
 	/* disconnect events for RC's only */
 	if (cm_ptr->ep->param.ep_attr.service_type == DAT_SERVICE_TYPE_RC) {
@@ -1812,7 +1810,13 @@ void cr_thread(void *arg)
 				dapl_os_unlock(&cr->lock);
 				dapli_socket_disconnect(cr);
 				break;
+			case DCM_DISCONNECTED:
+				cr->state = DCM_FREE;
+				dapl_os_unlock(&cr->lock);
+				break;
 			default:
+				if (ret == DAPL_FD_ERROR)
+					cr->state = DCM_FREE;
 				dapl_os_unlock(&cr->lock);
 				break;
 			}
diff --git a/dapl/openib_ucm/cm.c b/dapl/openib_ucm/cm.c
index 3a518c3..0fe5e2e 100644
--- a/dapl/openib_ucm/cm.c
+++ b/dapl/openib_ucm/cm.c
@@ -544,8 +544,9 @@ retry:
 		msg = (ib_cm_msg_t*) (uintptr_t) wc[i].wr_id;
 
 		dapl_dbg_log(DAPL_DBG_TYPE_CM,
-			     " ucm_recv: wc status=%d, ln=%d id=%p sqp=%x\n",
-			     wc[i].status, wc[i].byte_len,
+			     " ucm_recv: stat=%d op=%s ln=%d id=%p sqp=%x\n",
+			     wc[i].status, dapl_cm_op_str(ntohs(msg->op)),
+			     wc[i].byte_len,
 			     (void*)wc[i].wr_id, wc[i].src_qp);
 
 		/* validate CM message, version */
@@ -609,7 +610,7 @@ static int ucm_send(ib_hca_transport_t *tp, ib_cm_msg_t *msg, DAT_PVOID p_data,
 	sge.addr = (uintptr_t)smsg;
 
 	dapl_dbg_log(DAPL_DBG_TYPE_CM,
-		" ucm_send: op %s ln %d lid %x c_qpn %x rport %s\n",
+		" ucm_send: op %s ln %d lid %x c_qpn %x rport %x\n",
 		dapl_cm_op_str(ntohs(smsg->op)),
 		sge.length, htons(smsg->daddr.ib.lid),
 		htonl(smsg->dqpn), htons(smsg->dport));
@@ -818,7 +819,7 @@ static void ucm_disconnect_final(dp_ib_cm_handle_t cm)
 		return;
 
 	dapl_os_lock(&cm->lock);
-	if (cm->state == DCM_DISCONNECTED) {
+	if ((cm->state == DCM_DISCONNECTED) || (cm->state == DCM_FREE)) {
 		dapl_os_unlock(&cm->lock);
 		return;
 	}
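
Note on the openib_scm hunks: read together, they look like a small state machine. The disconnect path now sends its message before dropping the lock, and cr_thread() moves a DCM_DISCONNECTED object (or, in the default case, one whose descriptor reported an error) to DCM_FREE. The patch does not state its reasoning, so the following is only a sketch of one plausible reading: once the worker thread may mark a disconnected object free, the send presumably has to complete before any other thread can observe the new state. All names below (cm_model_t, cm_model_disconnect, cm_model_poll) are hypothetical stand-ins, a counter replaces the real send(), and the early-return condition in cm_model_disconnect() is an assumption because the hunk does not show it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical, simplified stand-ins for the DAPL CM states in the patch. */
typedef enum { CM_CONNECTED, CM_DISCONNECTED, CM_FREE } cm_state_t;

typedef struct {
	pthread_mutex_t lock;
	cm_state_t state;
	int disc_msgs_sent;	/* stands in for send() on the CM socket */
} cm_model_t;

static void cm_model_init(cm_model_t *cm)
{
	assert(cm != NULL);
	pthread_mutex_init(&cm->lock, NULL);
	cm->state = CM_CONNECTED;
	cm->disc_msgs_sent = 0;
}

/*
 * Disconnect path: the state change and the "send" happen under a single
 * lock hold, mirroring the reordering in dapli_socket_disconnect().
 * Returns 1 if this call sent the disconnect message, 0 otherwise.
 */
static int cm_model_disconnect(cm_model_t *cm)
{
	int sent = 0;

	pthread_mutex_lock(&cm->lock);
	if (cm->state == CM_CONNECTED) {
		cm->state = CM_DISCONNECTED;
		cm->disc_msgs_sent++;	/* "send" before dropping the lock */
		sent = 1;
	}
	pthread_mutex_unlock(&cm->lock);
	return sent;
}

/*
 * Worker-thread path: only the two cases touched by the cr_thread() hunk
 * are modelled; the other cases of the real switch are omitted.
 */
static void cm_model_poll(cm_model_t *cm, int fd_error)
{
	pthread_mutex_lock(&cm->lock);
	switch (cm->state) {
	case CM_DISCONNECTED:
		cm->state = CM_FREE;
		break;
	default:
		if (fd_error)
			cm->state = CM_FREE;
		break;
	}
	pthread_mutex_unlock(&cm->lock);
}
```

In this model a second cm_model_disconnect() call on the same object sends nothing, and a later cm_model_poll() moves the object to CM_FREE. That corresponds to the extra DCM_FREE check added to ucm_disconnect_final() in the openib_ucm hunk, which returns early for objects that are already disconnected or free.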