From patchwork Sun Mar 12 20:20:41 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Parav Pandit
X-Patchwork-Id: 9619537
From: Parav Pandit
To: Ursula Braun, Eli Cohen, Matan Barak
CC: Saeed Mahameed, Leon Romanovsky, "linux-rdma@vger.kernel.org"
Subject: RE: Fwd: mlx5_ib_post_send panic on s390x
Thread-Topic: Fwd: mlx5_ib_post_send panic on s390x
Date: Sun, 12 Mar 2017 20:20:41 +0000
References: <56246ac0-a706-291c-7baa-a6dd2c6331cd@linux.vnet.ibm.com>
 <20e4f31e-b2a7-89fb-d4c0-583c0dc1efb6@mellanox.com>
 <491cf3e1-b2f8-3695-ecd4-3d34b0ae9e25@linux.vnet.ibm.com>
X-Mailing-List: linux-rdma@vger.kernel.org
Hi Ursula,

> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org
> [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Ursula Braun
> Sent: Thursday, March 9, 2017 3:54 AM
> To: Eli Cohen; Matan Barak
> Cc: Saeed Mahameed; Leon Romanovsky; linux-rdma@vger.kernel.org
> Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
>
> On 03/06/2017 02:08 PM, Eli Cohen wrote:
> >>>
> >>> The problem seems to be caused by the usage of plain memcpy in
> >>> set_data_inl_seg(). The address provided by the SMC code in
> >>> struct ib_send_wr *wr is an address belonging to an area mapped
> >>> with the ib_dma_map_single() call. On s390x those kinds of
> >>> addresses require extra access functions (see
> >>> arch/s390/include/asm/io.h).
> >>>
> >
> > By definition, when you are posting a send request with inline, the
> > address must be mapped to the CPU, so a plain memcpy should work.
> >
> In the past I ran SMC-R with ConnectX-3 cards. The mlx4 driver does
> not seem to contain extra code for the IB_SEND_INLINE flag in
> ib_post_send. Does this mean that for SMC-R to run on ConnectX-3
> cards the IB_SEND_INLINE flag is ignored, and that is why I needed
> the ib_dma_map_single() call for the area used with ib_post_send()?
> And does it mean I should stay away from the IB_SEND_INLINE flag if
> I want to run the same SMC-R code with both ConnectX-3 and
> ConnectX-4 cards?
>
Last week I encountered the same kernel panic that you mention, on
ConnectX-4 adapters with SMC-R on x86_64.
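To make the addressing distinction above concrete, here is a minimal
sketch (mine, not from this thread) of how a ULP could decide per send
whether IB_SEND_INLINE is safe. The helper smc_wr_post() and its
parameters are hypothetical; inline_limit stands in for the
cap.max_inline_data the QP was created with.

#include <rdma/ib_verbs.h>

/*
 * Hypothetical helper: post one send, setting IB_SEND_INLINE only when
 * the payload fits the inline limit the QP was created with.  With
 * IB_SEND_INLINE the provider memcpy()s from sge.addr, so it must be a
 * CPU-addressable virtual address; without it the HCA DMAs from
 * sge.addr, so it must be the ib_dma_map_single() result.
 */
static int smc_wr_post(struct ib_qp *qp, void *buf, dma_addr_t dma_addr,
		       u32 len, u32 lkey, u32 inline_limit)
{
	struct ib_send_wr wr = {}, *bad_wr;
	struct ib_sge sge;

	sge.length = len;
	sge.lkey = lkey;

	if (len <= inline_limit) {
		/* CPU address: the driver copies into the WQE */
		sge.addr = (uintptr_t)buf;
		wr.send_flags = IB_SEND_SIGNALED | IB_SEND_INLINE;
	} else {
		/* DMA address: the HCA reads the buffer itself */
		sge.addr = dma_addr;
		wr.send_flags = IB_SEND_SIGNALED;
	}

	wr.opcode = IB_WR_SEND;
	wr.sg_list = &sge;
	wr.num_sge = 1;

	return ib_post_send(qp, &wr, &bad_wr);
}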
Shall I submit the fix below to the netdev mailing list? I have tested
this change. I also have an optimization that avoids the DMA mapping
for wr_tx_dma_addr:

-		lnk->wr_tx_sges[i].addr =
-			lnk->wr_tx_dma_addr + i * SMC_WR_BUF_SIZE;
+		lnk->wr_tx_sges[i].addr = (uintptr_t)(lnk->wr_tx_bufs + i);

I also have a fix for processing IB_SEND_INLINE in the mlx4 driver,
against a slightly older kernel base; it is attached below. I can
rebase my kernel and provide the equivalent fix in the mlx5_ib driver.
Let me know.

Regards,
Parav Pandit
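For orientation, here is the WQE inline-segment format that the patch
below manipulates (my sketch, not part of the patch). The struct is the
kernel's definition from include/linux/mlx4/qp.h; the MLX4_INLINE_SEG
and MLX4_INLINE_ALIGN values (1 << 31 and 64) match the userspace
libmlx4 code this patch appears to port.

/*
 * An inline segment is this 4-byte header followed immediately by
 * payload bytes.  The high bit (MLX4_INLINE_SEG) tells the HCA this is
 * inline data rather than a scatter/gather pointer, and 0xffffffff
 * marks a stamped, not-yet-valid chunk.  Since the HCA may prefetch a
 * 64-byte chunk as soon as byte_count looks valid, the payload must be
 * written (and ordered with wmb()) before byte_count is set -- exactly
 * the ordering the patch enforces.
 */
struct mlx4_wqe_inline_seg {
	__be32 byte_count;
};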
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index a2e4ca5..0d984f5 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -2748,6 +2748,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
 	unsigned long flags;
 	int nreq;
 	int err = 0;
+	int inl = 0;
 	unsigned ind;
 	int uninitialized_var(stamp);
 	int uninitialized_var(size);
@@ -2958,30 +2959,97 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
 		default:
 			break;
 		}
+		if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) {
+			struct mlx4_wqe_inline_seg *seg;
+			void *addr;
+			int len, seg_len;
+			int num_seg;
+			int off, to_copy;
+
+			inl = 0;
+
+			seg = wqe;
+			wqe += sizeof *seg;
+			off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1);
+			num_seg = 0;
+			seg_len = 0;
+
+			for (i = 0; i < wr->num_sge; ++i) {
+				addr = (void *) (uintptr_t) wr->sg_list[i].addr;
+				len = wr->sg_list[i].length;
+				inl += len;
+
+				if (inl > 16) {
+					inl = 0;
+					err = -ENOMEM;
+					*bad_wr = wr;
+					goto out;
+				}
-		/*
-		 * Write data segments in reverse order, so as to
-		 * overwrite cacheline stamp last within each
-		 * cacheline. This avoids issues with WQE
-		 * prefetching.
-		 */
+				while (len >= MLX4_INLINE_ALIGN - off) {
+					to_copy = MLX4_INLINE_ALIGN - off;
+					memcpy(wqe, addr, to_copy);
+					len -= to_copy;
+					wqe += to_copy;
+					addr += to_copy;
+					seg_len += to_copy;
+					wmb(); /* see comment below */
+					seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
+					seg_len = 0;
+					seg = wqe;
+					wqe += sizeof *seg;
+					off = sizeof *seg;
+					++num_seg;
+				}
-		dseg = wqe;
-		dseg += wr->num_sge - 1;
-		size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16);
+				memcpy(wqe, addr, len);
+				wqe += len;
+				seg_len += len;
+				off += len;
+			}
-		/* Add one more inline data segment for ICRC for MLX sends */
-		if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
-			     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
-			     qp->mlx4_ib_qp_type &
-			     (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) {
-			set_mlx_icrc_seg(dseg + 1);
-			size += sizeof (struct mlx4_wqe_data_seg) / 16;
-		}
+			if (seg_len) {
+				++num_seg;
+				/*
+				 * Need a barrier here to make sure
+				 * all the data is visible before the
+				 * byte_count field is set.  Otherwise
+				 * the HCA prefetcher could grab the
+				 * 64-byte chunk with this inline
+				 * segment and get a valid (!=
+				 * 0xffffffff) byte count but stale
+				 * data, and end up sending the wrong
+				 * data.
+				 */
+				wmb();
+				seg->byte_count = htonl(MLX4_INLINE_SEG | seg_len);
+			}
-		for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
-			set_data_seg(dseg, wr->sg_list + i);
+			size += (inl + num_seg * sizeof (*seg) + 15) / 16;
+		} else {
+			/*
+			 * Write data segments in reverse order, so as to
+			 * overwrite cacheline stamp last within each
+			 * cacheline. This avoids issues with WQE
+			 * prefetching.
+			 */
+
+			dseg = wqe;
+			dseg += wr->num_sge - 1;
+			size += wr->num_sge * (sizeof (struct mlx4_wqe_data_seg) / 16);
+
+			/* Add one more inline data segment for ICRC for MLX sends */
+			if (unlikely(qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
+				     qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI ||
+				     qp->mlx4_ib_qp_type &
+				     (MLX4_IB_QPT_PROXY_SMI_OWNER | MLX4_IB_QPT_TUN_SMI_OWNER))) {
+				set_mlx_icrc_seg(dseg + 1);
+				size += sizeof (struct mlx4_wqe_data_seg) / 16;
+			}
+			for (i = wr->num_sge - 1; i >= 0; --i, --dseg)
+				set_data_seg(dseg, wr->sg_list + i);
+		}
 		/*
 		 * Possibly overwrite stamping in cacheline with LSO
 		 * segment only after making sure all data segments
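A quick worked example of the new size accounting (my illustration, not
from the thread): size is counted in 16-byte WQE units, and each inline
segment adds a 4-byte header. A single 12-byte payload gives
size += (12 + 1 * 4 + 15) / 16 = 1, i.e. one extra 16-byte chunk, while
a 16-byte payload (the most this patch permits, per the inl > 16 check)
gives (16 + 4 + 15) / 16 = 2 chunks, since payload plus header no
longer fit in one.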