From patchwork Sat Mar 9 13:26:54 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hal Rosenstock
X-Patchwork-Id: 2241581
X-Patchwork-Delegate: hal@mellanox.com
Return-Path:
X-Original-To: patchwork-linux-rdma@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork1.kernel.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by patchwork1.kernel.org (Postfix) with ESMTP id AE25D3FCF2
	for ; Sat, 9 Mar 2013 13:27:03 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932449Ab3CIN06 (ORCPT ); Sat, 9 Mar 2013 08:26:58 -0500
Received: from mail-we0-f177.google.com ([74.125.82.177]:56756 "EHLO
	mail-we0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932446Ab3CIN06 (ORCPT ); Sat, 9 Mar 2013 08:26:58 -0500
Received: by mail-we0-f177.google.com with SMTP id d7so2047851wer.22
	for ; Sat, 09 Mar 2013 05:26:56 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=google.com; s=20120113;
	h=x-received:message-id:date:from:user-agent:mime-version:to:cc
	:subject:content-type:content-transfer-encoding:x-gm-message-state;
	bh=LEoNrZfgtNNZcCh+FhBq1sOOhCw+XoIhcVZ/iZsVM/0=;
	b=FzD3VjCDtWMpzqlUG9n/NVxorAu39jnJzGvWLzuFfYeLg4AyVnlsDBkVGJJaS082Hm
	UKdBOMj437WSpyrWWPMalmnzk1DOCZeJdllhIemnBVr9SQFxZz7eD2E25HODyVakU6he
	AJzdACp87eb/Z2qydIs4rzR2HsRPuSZQtQA0k7YVgwmh/p6a3vH3tcQAGFcrdqhJ80IB
	F2XqQqPlufZ9Ff1LvSh8Il3XymqcZtLJb7b7ROxfal4+LIdDGyPJeYp5GopuRSKBWmss
	Btlv4pD9chMX4wdNzh98+Jmvx1M/H3JZ/KBbQQed+WgiNAidwLCuPbyt0Xy0SrOQps6u
	jpoA==
X-Received: by 10.194.57.137 with SMTP id i9mr9963960wjq.18.1362835616302;
	Sat, 09 Mar 2013 05:26:56 -0800 (PST)
Received: from [192.168.1.102] (c-71-234-225-85.hsd1.ct.comcast.net.
	[71.234.225.85]) by mx.google.com with ESMTPS
	id j4sm4151445wiz.10.2013.03.09.05.26.55
	(version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
	Sat, 09 Mar 2013 05:26:55 -0800 (PST)
Message-ID: <513B389E.30506@dev.mellanox.co.il>
Date: Sat, 09 Mar 2013 08:26:54 -0500
From: Hal Rosenstock
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:9.0) Gecko/20111222
	Thunderbird/9.0.1
MIME-Version: 1.0
To: "linux-rdma (linux-rdma@vger.kernel.org)"
CC: Jim Schutt
Subject: [PATCH 1/2] opensm/osm_torus.c: avoid the possibility of following
	stale ->priv pointers
X-Gm-Message-State: ALoCoQm4lKKXjYR1cvPxwQ1HN2YSRatoehDFVdJojNA2m9kGj6n5ks78mOPmY94UPRxMEAdHN8pb
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-rdma@vger.kernel.org

From 557b01eb8857a197a973af76d1322f76d25495ed Mon Sep 17 00:00:00 2001
From: Jim Schutt
Date: Fri, 8 Mar 2013 08:59:01 -0700
Subject: [PATCH 1/2] opensm/osm_torus.c: avoid the possibility of following
 stale ->priv pointers

Torus-2QoS makes persistent use of osm_switch_t:priv and osm_port_t:priv
to speed its routing calculations.

In some situations t2q is unable to determine the location in the torus
topology of a discovered switch.  An example of this occurs when all but
one of the inter-switch links connected to a switch have failed.

When this happens, at the end of a full sweep opensm will attempt to
program the VLArb tables for that switch.  In doing so,
torus_update_osm_vlarb() will follow a stale osm_switch_t:priv pointer,
left over from the previous routing cycle when the switch was discovered
in the torus, ultimately resulting in a segmentation fault.

Prevent this by clearing stale values of osm_switch_t:priv as part of
the first step in torus topology discovery.  Do the same thing for
osm_port_t:priv, even though it has not yet been implicated in this sort
of issue.

Signed-off-by: Jim Schutt
---
 opensm/osm_torus.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
index 757a32a..bc3c0b8 100644
--- a/opensm/osm_torus.c
+++ b/opensm/osm_torus.c
@@ -1173,6 +1173,7 @@ bool capture_fabric(struct fabric *fabric)
 		osm_sw = (osm_switch_t *)item;
 		item = cl_qmap_next(item);
+		osm_sw->priv = NULL;  /* avoid stale pointer dereferencing */
 		osm_node = osm_sw->p_node;
 		if (osm_node_get_type(osm_node) != IB_NODE_TYPE_SWITCH)
@@ -1193,6 +1194,7 @@ bool capture_fabric(struct fabric *fabric)
 		lport = (osm_port_t *)item;
 		item = cl_qmap_next(item);
+		lport->priv = NULL;  /* avoid stale pointer dereferencing */
 		lphysp = lport->p_physp;
 		if (!(lphysp && osm_physp_is_valid(lphysp)))