From patchwork Sat Jun 13 16:27:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11602897 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 444CE138C for ; Sat, 13 Jun 2020 16:28:06 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2C5582078A for ; Sat, 13 Jun 2020 16:28:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2C5582078A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id AFD9721F9F2; Sat, 13 Jun 2020 09:27:42 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2260521E04D for ; Sat, 13 Jun 2020 09:27:25 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 9701327A; Sat, 13 Jun 2020 12:27:19 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8F04A2A4; Sat, 13 Jun 2020 12:27:19 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 13 Jun 2020 12:27:05 -0400 Message-Id: <1592065636-28333-10-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1592065636-28333-1-git-send-email-jsimmons@infradead.org> References: <1592065636-28333-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 09/20] lustre: ptlrpc: Clear bd_registered in ptlrpc_unregister_bulk X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn The patch for LU-12816 https://review.whamcloud.com/36309 has us clearing the bd_registered flag in ptl_send_rpc(). This flag is set in ptlrpc_register_bulk(), so it makes sense for us to clear it in ptlrpc_unregister_bulk(). When we're cleaning up in ptl_send_rpc() we can be sure the flag is cleared with the call to ptlrpc_unregister_bulk(). This commit also adds a test case for the LU-12816 bug. Fixes: 1010595a9bb6 ("lustre: ptlrpc: ptlrpc_register_bulk LBUG on ENOMEM") WC-bug-id: https://jira.whamcloud.com/browse/LU-13509 Lustre-commit: 15057a17ca1e2 ("LU-13509 ptlrpc: Clear bd_registered in ptlrpc_unregister_bulk") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/38457 Reviewed-by: Shaun Tancheff Reviewed-by: Wang Shilong Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_support.h | 1 + fs/lustre/ptlrpc/niobuf.c | 18 +++++++++++++----- 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h index b706a20..d1d21e0 100644 --- a/fs/lustre/include/obd_support.h +++ b/fs/lustre/include/obd_support.h @@ -362,6 +362,7 @@ #define OBD_FAIL_PTLRPC_LONG_REQ_UNLINK 0x51b #define OBD_FAIL_PTLRPC_LONG_BOTH_UNLINK 0x51c #define OBD_FAIL_PTLRPC_BULK_ATTACH 0x521 +#define OBD_FAIL_PTLRPC_BULK_REPLY_ATTACH 0x522 #define OBD_FAIL_PTLRPC_ROUND_XID 0x530 #define OBD_FAIL_PTLRPC_CONNECT_RACE 0x531 diff --git a/fs/lustre/ptlrpc/niobuf.c b/fs/lustre/ptlrpc/niobuf.c index d62629a..331a0c8 100644 --- a/fs/lustre/ptlrpc/niobuf.c +++ b/fs/lustre/ptlrpc/niobuf.c @@ -250,6 +250,9 @@ int ptlrpc_unregister_bulk(struct ptlrpc_request *req, int async) LASSERT(!in_interrupt()); /* might sleep */ + if (desc) + desc->bd_registered = 0; + /* Let's setup deadline for reply unlink. */ if (OBD_FAIL_CHECK(OBD_FAIL_PTLRPC_LONG_BULK_UNLINK) && async && req->rq_bulk_deadline == 0 && cfs_fail_val == 0) @@ -614,9 +617,16 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply) request->rq_repmsg = NULL; } - reply_me = LNetMEAttach(request->rq_reply_portal, - connection->c_peer, request->rq_xid, 0, - LNET_UNLINK, LNET_INS_AFTER); + if (request->rq_bulk && + OBD_FAIL_CHECK(OBD_FAIL_PTLRPC_BULK_REPLY_ATTACH)) { + reply_me = ERR_PTR(-ENOMEM); + } else { + reply_me = LNetMEAttach(request->rq_reply_portal, + connection->c_peer, + request->rq_xid, 0, + LNET_UNLINK, LNET_INS_AFTER); + } + if (IS_ERR(reply_me)) { rc = PTR_ERR(reply_me); CERROR("LNetMEAttach failed: %d\n", rc); @@ -724,8 +734,6 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply) * the chance to have long unlink to sluggish net is smaller here. */ ptlrpc_unregister_bulk(request, 0); - if (request->rq_bulk) - request->rq_bulk->bd_registered = 0; out: if (rc == -ENOMEM) { /*