From patchwork Fri Apr 27 03:36:18 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Parav Pandit
X-Patchwork-Id: 10367407
From: Parav Pandit
To: Shiraz Saleem, Raju Rangoju
CC: "linux-rdma@vger.kernel.org", SWise OGC, "sean.hefty@intel.com"
Subject: RE: iwarp kernel mode applications are broken with commit f35faa4ba
Thread-Topic: iwarp kernel mode applications are broken with commit f35faa4ba
Date: Fri, 27 Apr 2018 03:36:18 +0000
References: <20180427014822.GA50020@ssaleem-MOBL4.amr.corp.intel.com>
In-Reply-To: <20180427014822.GA50020@ssaleem-MOBL4.amr.corp.intel.com>
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

Hi Shiraz, Raju,

> -----Original Message-----
> From: Shiraz Saleem [mailto:shiraz.saleem@intel.com]
> Sent: Thursday, April 26, 2018 8:48 PM
> To: Raju Rangoju
> Cc: Parav Pandit; linux-rdma@vger.kernel.org; SWise OGC; sean.hefty@intel.com
> Subject: Re: iwarp kernel mode applications are broken with commit f35faa4ba
>
> On Thu, Apr 26, 2018 at 07:46:38PM +0000, Raju Rangoju wrote:
> > Hi Parav,
> >
> > The following commit f35faa4ba broke iWARP kernel mode applications.
> >
> > commit f35faa4ba9568138eea1c58abb92e8ef415dce41
> > Author: Parav Pandit
> > Date: Sun Apr 1 15:08:20 2018 +0300
> >
> >     IB/core: Simplify ib_query_gid to always refer to cache
> >
> > [root@bhumthang]# nvme discover -t rdma -a 102.1.1.17
> > Failed to write to /dev/nvme-fabrics: Invalid argument
> >
> > [root@bhumthang]# dmesg
> > [55961.151787] nvme nvme0: rdma_connect failed (-22).
> > [55961.151971] nvme nvme0: rdma connection establishment failed (-22)
> >
> > ------------
> > iser
> > -------------
> > [54714.834984] iw_cxgb4: Chelsio T4/T5 RDMA Driver - version 0.1
> > [54714.834987] iw_cxgb4: 0000:04:00.4: Up
> > [54714.834987] iw_cxgb4: 0000:04:00.4: On-Chip Queues not supported on this device
> > [54714.855963] ib_srpt MAD registration failed for cxgb4_0-1.
> > [54714.855972] ib_srpt srpt_add_one(cxgb4_0) failed.
> > [54715.123119] iw_cxgb4: 0000:07:00.4: Up
> > [54715.123121] iw_cxgb4: 0000:07:00.4: On-Chip Queues not supported on this device
> > [54715.125977] cxgb4 0000:07:00.4 enp7s0f4: port module unplugged
> > [54715.166076] ib_srpt MAD registration failed for cxgb4_1-1.
> > [54715.166080] ib_srpt srpt_add_one(cxgb4_1) failed.
> > [54834.322675] iser: iser_route_handler: failure connecting: -22
> > [54835.326918] iser: iser_route_handler: failure connecting: -22
> > [54836.331221] iser: iser_route_handler: failure connecting: -22
> > [54837.335625] iser: iser_route_handler: failure connecting: -22
> > [54838.339980] iser: iser_route_handler: failure connecting: -22
> > [54839.343882] iser: iser_route_handler: failure connecting: -22
> >
> >
> My validation team reported the same issue on i40iw with 4.17-rc kernels.
>
> Some more data. Looks like the failure is because we can't find the cached gid
> due to the gid idx being wrong in query_gid.
>
> rdma_connect
>   cma_connect_iw
>     cma_modify_qp_rtr
>       ib_query_gid
>         ib_get_cached_gid
>           __ib_cache_gid_get (EINVAL)
>

This call trace is helpful.
Can you please run

  ibv_devinfo -v | grep GID

and check that you are getting the expected GID?
For iWARP we have a single-entry GID table, so I want to make sure that
the GID table is built correctly.

From the above call trace, it appears that cma_modify_qp_rtr() declares

  struct ib_qp_attr qp_attr;

and the ah_attr in that qp_attr remains uninitialized by
iw_cm_init_qp_attr() and iwcm_init_qp_init_attr().

Before my fix, gid_index was always ignored by the query_gid() callbacks
such as i40iw_query_gid() and c4iw_query_gid(), so it used to work.
Now my fix expects all values to be correct; because ib_qp_attr is
uninitialized, the GID index passed to ib_query_gid() is likely garbage,
so the lookup fails.

So can you please try the hunk below and see if ib_query_gid() progresses
for you?
If it works, I will send the proper patch shortly.
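
As a rough illustration of the failure mode described above, here is a
minimal userspace sketch, not kernel code. The names fake_qp_attr,
fake_grh, fake_query_gid() and FAKE_GID_TBL_LEN are hypothetical stand-ins
for ib_qp_attr, its embedded route data, ib_query_gid() and the one-entry
iWARP GID table. It assumes the lookup rejects any index outside the
table, and it fills the struct with a byte pattern to simulate stack
garbage instead of reading a truly uninitialized variable.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the single-entry iWARP GID table. */
#define FAKE_GID_TBL_LEN 1u

/* Hypothetical, trimmed-down stand-ins for the real kernel structures. */
struct fake_grh {
	unsigned int sgid_index;
};

struct fake_qp_attr {
	int qp_state;
	struct fake_grh grh;
};

/* Mimics a lookup that validates the index instead of ignoring it. */
static int fake_query_gid(unsigned int index)
{
	return index < FAKE_GID_TBL_LEN ? 0 : -EINVAL;
}

/* Zero-initialized attribute: sgid_index is 0, so the lookup succeeds. */
static int lookup_with_zeroed_attr(void)
{
	/* The kernel hunk spells this "= {}" (GNU C); "{0}" is the portable form. */
	struct fake_qp_attr qp_attr = {0};

	return fake_query_gid(qp_attr.grh.sgid_index);
}

/* Simulated stack garbage: sgid_index is far outside the table. */
static int lookup_with_stale_attr(void)
{
	struct fake_qp_attr qp_attr;

	memset(&qp_attr, 0xa5, sizeof(qp_attr));
	return fake_query_gid(qp_attr.grh.sgid_index);
}
```

Calling lookup_with_zeroed_attr() returns 0, while
lookup_with_stale_attr() returns -EINVAL, which matches the -22 seen in
the logs if the real index is similarly out of range.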
---
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8512f63..e119cff 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -863,7 +863,7 @@ void rdma_destroy_qp(struct rdma_cm_id *id)
 static int cma_modify_qp_rtr(struct rdma_id_private *id_priv,
 			     struct rdma_conn_param *conn_param)
 {
-	struct ib_qp_attr qp_attr;
+	struct ib_qp_attr qp_attr = {};
 	int qp_attr_mask, ret;
 	union ib_gid sgid;
 