From patchwork Thu Oct 13 01:47:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Daisuke Matsuda (Fujitsu)" X-Patchwork-Id: 13005552 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 124F6C433FE for ; Thu, 13 Oct 2022 01:59:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229604AbiJMB72 (ORCPT ); Wed, 12 Oct 2022 21:59:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60386 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229569AbiJMB71 (ORCPT ); Wed, 12 Oct 2022 21:59:27 -0400 X-Greylist: delayed 63 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Wed, 12 Oct 2022 18:59:26 PDT Received: from esa10.hc1455-7.c3s2.iphmx.com (esa10.hc1455-7.c3s2.iphmx.com [139.138.36.225]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C9C712C8A8 for ; Wed, 12 Oct 2022 18:59:25 -0700 (PDT) X-IronPort-AV: E=McAfee;i="6500,9779,10498"; a="79798302" X-IronPort-AV: E=Sophos;i="5.95,180,1661785200"; d="scan'208";a="79798302" Received: from unknown (HELO oym-r2.gw.nic.fujitsu.com) ([210.162.30.90]) by esa10.hc1455-7.c3s2.iphmx.com with ESMTP; 13 Oct 2022 10:48:01 +0900 Received: from oym-m3.gw.nic.fujitsu.com (oym-nat-oym-m3.gw.nic.fujitsu.com [192.168.87.60]) by oym-r2.gw.nic.fujitsu.com (Postfix) with ESMTP id D288FD432A for ; Thu, 13 Oct 2022 10:48:00 +0900 (JST) Received: from m3002.s.css.fujitsu.com (msm3.b.css.fujitsu.com [10.128.233.104]) by oym-m3.gw.nic.fujitsu.com (Postfix) with ESMTP id 0FB18D9498 for ; Thu, 13 Oct 2022 10:48:00 +0900 (JST) Received: from localhost.localdomain (unknown [10.19.3.107]) by m3002.s.css.fujitsu.com (Postfix) with ESMTP id C4665200B396; Thu, 13 Oct 2022 10:47:59 +0900 (JST) From: Daisuke Matsuda To: linux-rdma@vger.kernel.org, leonro@nvidia.com, jgg@nvidia.com, zyjzyj2000@gmail.com Cc: Daisuke Matsuda , Li Zhijian Subject: [PATCH 1/2] RDMA/rxe: Make responder handle RDMA Read failures Date: Thu, 13 Oct 2022 10:47:23 +0900 Message-Id: <20221013014724.3786212-1-matsuda-daisuke@fujitsu.com> X-Mailer: git-send-email 2.31.1 MIME-Version: 1.0 X-TM-AS-GCONF: 00 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org Currently, responder can reply packets with invalid payloads if it fails to copy messages to the packets. Add an error handling in read_reply() to inform a requesting node of the failure. Suggested-by: Li Zhijian Signed-off-by: Daisuke Matsuda Reviewed-by: Li Zhijian --- FOR REVIEWERS: I referred to IB Specification Vol 1-Revision-1.5 to make this patch. Please see Ch.9.7.5.2.4, Ch.9.7.4.1.5 and Ch.9.7.5.1.3. drivers/infiniband/sw/rxe/rxe_resp.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index ed5a09e86417..82b74e926e09 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -809,10 +809,14 @@ static enum resp_states read_reply(struct rxe_qp *qp, if (!skb) return RESPST_ERR_RNR; - rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt), - payload, RXE_FROM_MR_OBJ); + err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt), + payload, RXE_FROM_MR_OBJ); if (mr) rxe_put(mr); + if (err) { + kfree_skb(skb); + return RESPST_ERR_RKEY_VIOLATION; + } if (bth_pad(&ack_pkt)) { u8 *pad = payload_addr(&ack_pkt) + payload; From patchwork Thu Oct 13 01:47:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Daisuke Matsuda (Fujitsu)" X-Patchwork-Id: 13005551 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75BE7C4332F for ; Thu, 13 Oct 2022 01:49:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229864AbiJMBt0 (ORCPT ); Wed, 12 Oct 2022 21:49:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229847AbiJMBtZ (ORCPT ); Wed, 12 Oct 2022 21:49:25 -0400 X-Greylist: delayed 63 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Wed, 12 Oct 2022 18:49:24 PDT Received: from esa12.hc1455-7.c3s2.iphmx.com (esa12.hc1455-7.c3s2.iphmx.com [139.138.37.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3157712C882 for ; Wed, 12 Oct 2022 18:49:23 -0700 (PDT) X-IronPort-AV: E=McAfee;i="6500,9779,10498"; a="71572157" X-IronPort-AV: E=Sophos;i="5.95,180,1661785200"; d="scan'208";a="71572157" Received: from unknown (HELO yto-r4.gw.nic.fujitsu.com) ([218.44.52.220]) by esa12.hc1455-7.c3s2.iphmx.com with ESMTP; 13 Oct 2022 10:48:18 +0900 Received: from yto-m2.gw.nic.fujitsu.com (yto-nat-yto-m2.gw.nic.fujitsu.com [192.168.83.65]) by yto-r4.gw.nic.fujitsu.com (Postfix) with ESMTP id 43938AB655 for ; Thu, 13 Oct 2022 10:48:18 +0900 (JST) Received: from m3002.s.css.fujitsu.com (msm3.b.css.fujitsu.com [10.128.233.104]) by yto-m2.gw.nic.fujitsu.com (Postfix) with ESMTP id 7AE6DD9BFD for ; Thu, 13 Oct 2022 10:48:17 +0900 (JST) Received: from localhost.localdomain (unknown [10.19.3.107]) by m3002.s.css.fujitsu.com (Postfix) with ESMTP id 5E675200B3B4; Thu, 13 Oct 2022 10:48:17 +0900 (JST) From: Daisuke Matsuda To: linux-rdma@vger.kernel.org, leonro@nvidia.com, jgg@nvidia.com, zyjzyj2000@gmail.com Cc: Daisuke Matsuda Subject: [PATCH 2/2] RDMA/rxe: Handle remote errors in the midst of a Read reply sequence Date: Thu, 13 Oct 2022 10:47:24 +0900 Message-Id: <20221013014724.3786212-2-matsuda-daisuke@fujitsu.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20221013014724.3786212-1-matsuda-daisuke@fujitsu.com> References: <20221013014724.3786212-1-matsuda-daisuke@fujitsu.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org Requesting nodes do not handle a reported error correctly if it is generated in the middle of multi-packet Read responses, and the node tries to resend the request endlessly. Let completer terminate the connection in that case. Signed-off-by: Daisuke Matsuda Reviewed-by: Li Zhijian --- FOR REVIEWERS: I referred to IB Specification Vol 1-Revision-1.5 to make this patch. Please see Ch.9.9.2.2, Ch.9.9.2.4.2 and Table 59. drivers/infiniband/sw/rxe/rxe_comp.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c index fb0c008af78c..c9170dd99f3a 100644 --- a/drivers/infiniband/sw/rxe/rxe_comp.c +++ b/drivers/infiniband/sw/rxe/rxe_comp.c @@ -200,6 +200,10 @@ static inline enum comp_state check_psn(struct rxe_qp *qp, */ if (pkt->psn == wqe->last_psn) return COMPST_COMP_ACK; + else if (pkt->opcode == IB_OPCODE_RC_ACKNOWLEDGE && + (qp->comp.opcode == IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST || + qp->comp.opcode == IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE)) + return COMPST_CHECK_ACK; else return COMPST_DONE; } else if ((diff > 0) && (wqe->mask & WR_ATOMIC_OR_READ_MASK)) { @@ -228,6 +232,10 @@ static inline enum comp_state check_ack(struct rxe_qp *qp, case IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST: case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE: + /* Check NAK code to handle a remote error */ + if (pkt->opcode == IB_OPCODE_RC_ACKNOWLEDGE) + break; + if (pkt->opcode != IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE && pkt->opcode != IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST) { /* read retries of partial data may restart from