From patchwork Mon Jul 22 16:10:56 2013
X-Patchwork-Submitter: Sasha Khapyorsky
X-Patchwork-Id: 2831495
X-Patchwork-Delegate: hal@mellanox.com
Date: Mon, 22 Jul 2013 19:10:56 +0300
From: Sasha Khapyorsky
To: Hal Rosenstock
Cc: linux-rdma@vger.kernel.org, Alex Netes, Roy.Koren@emc.com
Subject: [PATCH 3/3] OpenSM: single port sweep
Message-ID: <20130722161055.GC24222@gmail.com>
X-Mailing-List: linux-rdma@vger.kernel.org

This makes it possible to keep the SM/SA operational even when the
local SM port is disconnected. This is needed in order not to break
existing loopback connections. As a side effect, it lets us start
OpenSM on a disconnected port.

Signed-off-by: Sasha Khapyorsky
---
 opensm/osm_state_mgr.c | 95 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 87 insertions(+), 8 deletions(-)

diff --git a/opensm/osm_state_mgr.c b/opensm/osm_state_mgr.c
index 1b73834..c586e64 100644
--- a/opensm/osm_state_mgr.c
+++ b/opensm/osm_state_mgr.c
@@ -1075,6 +1075,90 @@ int wait_for_pending_transactions(osm_stats_t * stats)
 	return osm_exit_flag;
 }
 
+static void single_node_sweep(osm_sm_t *sm)
+{
+	osm_opensm_report_event(sm->p_subn->p_osm,
+				OSM_EVENT_ID_HEAVY_SWEEP_DONE, NULL);
+
+	OSM_LOG_MSG_BOX(sm->p_log, OSM_LOG_VERBOSE, "HEAVY SWEEP COMPLETE");
+
+	osm_drop_mgr_process(sm);
+
+	/*
+	 * If we are not MASTER already - this means that we are
+	 * in discovery state. call osm_sm_state_mgr with signal
+	 * DISCOVERY_COMPLETED
+	 */
+	if (sm->p_subn->sm_state == IB_SMINFO_STATE_DISCOVERING)
+		osm_sm_state_mgr_process(sm, OSM_SM_SIGNAL_DISCOVERY_COMPLETED);
+
+	osm_pkey_mgr_process(sm->p_subn->p_osm);
+
+	/* try to restore SA DB (this should be before lid_mgr
+	   because we may want to disable clients reregistration
+	   when SA DB is restored) */
+	osm_sa_db_file_load(sm->p_subn->p_osm);
+
+	if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats))
+		return;
+
+	OSM_LOG_MSG_BOX(sm->p_log, OSM_LOG_VERBOSE,
+			"PKEY setup completed - STARTING SM LID CONFIG");
+
+	osm_lid_mgr_process_sm(&sm->lid_mgr);
+	if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats))
+		return;
+
+	state_mgr_notify_lid_change(sm);
+
+	/* At this point we need to check the consistency of
+	 * the port_lid_tbl under the subnet. There might be
+	 * errors in it if PortInfo Set requests didn't reach
+	 * their destination. */
+	state_mgr_check_tbl_consistency(sm);
+
+	OSM_LOG_MSG_BOX(sm->p_log, OSM_LOG_VERBOSE, "LID ASSIGNMENT COMPLETE");
+
+	/* in any case we zero this flag */
+	sm->p_subn->coming_out_of_standby = FALSE;
+
+	/* If there were errors - then the subnet is not really up */
+	if (sm->p_subn->subnet_initialization_error == TRUE) {
+		osm_log_v2(sm->p_log, OSM_LOG_SYS, FILE_ID,
+			   "Errors during initialization\n");
+		OSM_LOG_MSG_BOX(sm->p_log, OSM_LOG_ERROR,
+				"ERRORS DURING INITIALIZATION");
+	} else {
+		sm->p_subn->need_update = 0;
+		osm_dump_all(sm->p_subn->p_osm);
+		state_mgr_up_msg(sm);
+		sm->p_subn->first_time_master_sweep = FALSE;
+		sm->p_subn->set_client_rereg_on_sweep = FALSE;
+
+		if (OSM_LOG_IS_ACTIVE_V2(sm->p_log, OSM_LOG_VERBOSE) ||
+		    sm->p_subn->opt.sa_db_dump)
+			osm_sa_db_file_dump(sm->p_subn->p_osm);
+	}
+
+	/*
+	 * Finally signal the subnet up event
+	 */
+	cl_event_signal(&sm->subnet_up_event);
+
+	osm_opensm_report_event(sm->p_subn->p_osm, OSM_EVENT_ID_SUBNET_UP,
+				NULL);
+
+	/* if we got a signal to force heavy sweep or errors
+	 * in the middle of the sweep - try another sweep. */
+	if (sm->p_subn->force_heavy_sweep
+	    || sm->p_subn->subnet_initialization_error)
+		osm_sm_signal(sm, OSM_SIGNAL_SWEEP);
+
+	/* Write a new copy of our persistent guid2mkey database */
+	osm_db_store(sm->p_subn->p_g2m);
+	osm_db_store(sm->p_subn->p_neighbor);
+}
+
 static void do_sweep(osm_sm_t * sm)
 {
 	ib_api_status_t status;
@@ -1234,15 +1318,10 @@ repeat_discovery:
 					"SM PORT DOWN");
 		}
 
-		/* Run the drop manager - we want to clear all records */
-		osm_drop_mgr_process(sm);
-
-		/* Move to DISCOVERING state */
-		if (sm->p_subn->sm_state != IB_SMINFO_STATE_DISCOVERING)
-			osm_sm_state_mgr_process(sm, OSM_SM_SIGNAL_DISCOVER);
-		osm_opensm_report_event(sm->p_subn->p_osm,
-					OSM_EVENT_ID_STATE_CHANGE, NULL);
+		/* special case - just loopback on disconnected node */
+		single_node_sweep(sm);
 		return;
+
 	} else {
 		if (!sm->p_subn->last_sm_port_state) {
 			sm->p_subn->last_sm_port_state = 1;
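
Note: the branch this patch changes in do_sweep() can be pictured with the
toy model below. It is only a sketch, not OpenSM code: struct toy_sm,
enum sweep_kind and toy_do_sweep() are hypothetical names, and the
"one LID for the local port" behaviour is an assumption drawn from the
call to osm_lid_mgr_process_sm() above. It shows that a down SM port now
leads to a reduced sweep that still ends in "subnet up", instead of
dropping everything and falling back to DISCOVERING.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for OpenSM state; not the real types. */
enum sweep_kind { SWEEP_FULL, SWEEP_SINGLE_NODE };

struct toy_sm {
	bool port_up;         /* current link state of the local SM port */
	bool last_port_state; /* plays the role of last_sm_port_state */
	bool subnet_up;       /* set once a sweep reaches "subnet up" */
	int lids_assigned;    /* number of ports that received a LID */
	int fabric_ports;     /* ports reachable while the link is up */
};

/* Toy version of the port-down decision in do_sweep(). */
static enum sweep_kind toy_do_sweep(struct toy_sm *sm)
{
	assert(sm != NULL);

	if (!sm->port_up) {
		sm->last_port_state = false;
		/* Assumption: only the SM's own port is configured here. */
		sm->lids_assigned = 1;
		sm->subnet_up = true;
		return SWEEP_SINGLE_NODE;
	}

	sm->last_port_state = true;
	sm->lids_assigned = sm->fabric_ports;
	sm->subnet_up = true;
	return SWEEP_FULL;
}
```

Calling toy_do_sweep() on a structure with port_up set to false takes the
single-node path and still marks the subnet as up, which is what keeps
loopback clients served in this model.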