opensm/osm_sminfo_rcv.c: send trap144 to a newly found MASTER SM when in MASTER state

Message ID 1394110861-6128-1-git-send-email-alexne@mellanox.com (mailing list archive)
State Accepted
Delegated to: Hal Rosenstock

Commit Message

Alex Netes March 6, 2014, 1:01 p.m. UTC
Before this patch, when an SM in Master state finds another Master SM, it
sends trap144 to the Master SM (or higher-priority SM) that it found
earlier, while it was in Discovering/Standby state, rather than to the
newly found Master SM.
This can lead to wrong behaviour in a multi-SM topology:

Setup: SM1 with priority 1, SM2 with priority 2, SM3 with priority 3.
Flow:
1. Set SM3 to ignore SMInfo MADs -> SM2 becomes master
2. Set SM2 to ignore SMInfo MADs -> SM1 becomes master
3. Set SM2 to accept SMInfo MADs again
4. SM2 sends SMInfo to SM1 -> finds that SM1 is master
5. SM2 sends trap144 to SM3 instead of sending it to SM1

Signed-off-by: Alex Netes <alexne@mellanox.com>
---
 opensm/osm_sminfo_rcv.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)
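
The flow in the commit message can be sketched with a toy model. This is not
OpenSM code: `struct toy_sm`, the two `trap144_target_*` helpers and the GUID
values are hypothetical, and the sketch assumes (as the comment in the patch
implies) that the trap144 destination is derived from `master_sm_guid`.

```c
#include <stdint.h>

/* Hypothetical GUIDs, for illustration only. */
#define SM1_GUID 0x1ULL
#define SM3_GUID 0x3ULL

/* Toy stand-in for the two osm_sm_t fields involved in the patch. */
struct toy_sm {
	uint64_t master_sm_guid;	/* master recorded in an earlier state */
	uint64_t polling_sm_guid;	/* SM currently being polled */
};

/* Pre-patch behaviour (assumed): the trap goes to whatever
 * master_sm_guid still holds from Discovering/Standby state.
 */
static uint64_t trap144_target_before(const struct toy_sm *sm)
{
	return sm->master_sm_guid;
}

/* Post-patch behaviour: master_sm_guid is refreshed from
 * polling_sm_guid first, so the trap goes to the newly found master.
 */
static uint64_t trap144_target_after(struct toy_sm *sm)
{
	sm->master_sm_guid = sm->polling_sm_guid;
	return sm->master_sm_guid;
}
```

Modelling SM2 at step 4 as `{ .master_sm_guid = SM3_GUID, .polling_sm_guid =
SM1_GUID }`, the first helper selects SM3 and the second selects SM1.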

Comments

Hal Rosenstock March 6, 2014, 3:28 p.m. UTC | #1
On 3/6/2014 8:01 AM, Alex Netes wrote:
> Before this patch, when SM in Master state finds other Master SM, it
> sends trap144 to previously found Master SM/SM with higher priority when
> it was in Discovering/Standby state.
> This can lead to wrong behaviour in a multi-SM topology:
> 
> Setup: SM1 with priority 1, SM2 with priority 2, SM3 with priority 3.
> Flow:
> 1. setting SM3 to ignore SMInfo MADs -> SM2 become master
> 2. setting SM2 to ignore SMInfo MADs -> SM1 become master
> 3. setting SM2 to accept SMInfo MADs
> 4. SM2 sends SMInfo to SM1 -> finds that SM1 is master
> 5. SM2 sends trap144 to SM3 instead of sending it to SM1
> 
> Signed-off-by: Alex Netes <alexne@mellanox.com>

Thanks. Applied.

-- Hal

Patch

diff --git a/opensm/osm_sminfo_rcv.c b/opensm/osm_sminfo_rcv.c
index 9f62f9f..100a82d 100644
--- a/opensm/osm_sminfo_rcv.c
+++ b/opensm/osm_sminfo_rcv.c
@@ -395,8 +395,13 @@  static void smi_rcv_process_get_sm(IN osm_sm_t * sm,
 			if (sm->polling_sm_guid) {
 				if (smi_rcv_remote_sm_is_higher(sm, p_smi))
 					sm->p_subn->force_heavy_sweep = TRUE;
-				else
+				else {
+					/* Update master_sm_guid to the GUID of the newly
+					 * found MASTER SM and send trap 144 to it.
+					 */
+					sm->master_sm_guid = sm->polling_sm_guid;
 					osm_send_trap144(sm, TRAP_144_MASK_SM_PRIORITY_CHANGE);
+				}
 				osm_sm_state_mgr_signal_master_is_alive(sm);
 			} else {
 				/* This is a response we got while sweeping the subnet.