From patchwork Thu Feb 27 21:16:02 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 11410717
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7FDDB1580
	for ; Thu, 27 Feb 2020 21:45:09 +0000 (UTC)
Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 6843A24690
	for ; Thu, 27 Feb 2020 21:45:09 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6843A24690
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org
Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 550FA34AF6C;
	Thu, 27 Feb 2020 13:36:01 -0800 (PST)
X-Original-To: lustre-devel@lists.lustre.org
Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com
Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 23B91348840
	for ; Thu, 27 Feb 2020 13:20:52 -0800 (PST)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 16F749187;
	Thu, 27 Feb 2020 16:18:19 -0500 (EST)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 15E4146D; Thu, 27 Feb 2020 16:18:19 -0500 (EST)
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Date: Thu, 27 Feb 2020 16:16:02 -0500
Message-Id: <1582838290-17243-495-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
References: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 494/622] lnet: Use alternate ping processing
 for non-mr peers
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Chris Horn

Router peers without multi-rail capabilities (i.e. older Lustre
versions) or router peers that have discovery disabled need to use
the alternate ping processing introduced by LU-12422. Otherwise,
these peers go through the normal discovery processing, but their
remote network interfaces are never added to the peer object. This
causes routes through these peers to be considered down when
avoid_asym_router_failure is enabled.

Cray-bug-id: LUS-7866
WC-bug-id: https://jira.whamcloud.com/browse/LU-12763
Lustre-commit: 010f6b1819b9 ("LU-12763 lnet: Use alternate ping processing for non-mr peers")
Signed-off-by: Chris Horn
Reviewed-on: https://review.whamcloud.com/36182
Reviewed-by: Alexandr Boyko
Reviewed-by: Amir Shehata
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 include/linux/lnet/lib-lnet.h | 1 +
 net/lnet/lnet/peer.c          | 1 +
 net/lnet/lnet/router.c        | 9 ++++++---
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index db1b7e5..56556fd 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -878,6 +878,7 @@ int lnet_get_peer_ni_info(u32 peer_index, u64 *nid,
 bool lnet_peer_is_uptodate(struct lnet_peer *lp);
 bool lnet_peer_is_uptodate_locked(struct lnet_peer *lp);
 bool lnet_is_discovery_disabled(struct lnet_peer *lp);
+bool lnet_is_discovery_disabled_locked(struct lnet_peer *lp);
 bool lnet_peer_gw_discovery(struct lnet_peer *lp);
 
 static inline bool
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index 0d33ade..a067136 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -1141,6 +1141,7 @@ struct lnet_peer_ni *
 
 bool
 lnet_is_discovery_disabled_locked(struct lnet_peer *lp)
+__must_hold(&lp->lp_lock)
 {
 	if (lnet_peer_discovery_disabled)
 		return true;
diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index 7246eea..a5e4af0 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -227,7 +227,7 @@ bool lnet_is_route_alive(struct lnet_route *route)
 	 * aliveness information can only be obtained when discovery is
 	 * enabled.
 	 */
-	if (lnet_peer_discovery_disabled)
+	if (lnet_is_discovery_disabled(gw))
 		return route->lr_alive;
 
 	/* check the gateway's interfaces on the route rnet to make sure
@@ -316,11 +316,14 @@ bool lnet_is_route_alive(struct lnet_route *route)
 
 	spin_lock(&lp->lp_lock);
 	lp_state = lp->lp_state;
-	spin_unlock(&lp->lp_lock);
 
 	/* only handle replies if discovery is disabled. */
-	if (!lnet_peer_discovery_disabled)
+	if (!lnet_is_discovery_disabled_locked(lp)) {
+		spin_unlock(&lp->lp_lock);
 		return;
+	}
+
+	spin_unlock(&lp->lp_lock);
 
 	if (lp_state & LNET_PEER_PING_FAILED) {
 		CDEBUG(D_NET,
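
Illustrative note (not part of the patch): the router.c change snapshots
the peer state and evaluates the per-peer "discovery disabled" check while
the peer lock is still held, and releases the lock on both exit paths. A
minimal user-space sketch of that pattern follows. Everything in it is a
hypothetical stand-in: struct peer, the PEER_* bits, enum reply_action and
the function names are not LNet definitions, a pthread mutex stands in for
the kernel spinlock, and the per-peer rule (non-multi-rail, or discovery
disabled on the peer) is assumed from the commit message rather than taken
from the LNet source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the LNet peer state bits. */
#define PEER_MULTI_RAIL   (1u << 0)
#define PEER_NO_DISCOVERY (1u << 1)
#define PEER_PING_FAILED  (1u << 2)

struct peer {
	pthread_mutex_t lock;	/* plays the role of lp_lock */
	unsigned int state;	/* plays the role of lp_state */
};

enum reply_action { REPLY_IGNORED, REPLY_PING_FAILED, REPLY_PROCESSED };

/* Models the global lnet_peer_discovery_disabled setting. */
bool global_discovery_disabled;

/* Caller must hold lp->lock (the patch annotates this with __must_hold). */
bool peer_discovery_disabled_locked(const struct peer *lp)
{
	if (global_discovery_disabled)
		return true;

	/* Assumed per-peer rule, derived from the commit message only. */
	return !(lp->state & PEER_MULTI_RAIL) ||
	       (lp->state & PEER_NO_DISCOVERY);
}

/* Only handle ping replies for peers that do not use discovery. */
enum reply_action handle_ping_reply(struct peer *lp)
{
	unsigned int state;

	assert(lp != NULL);

	pthread_mutex_lock(&lp->lock);
	state = lp->state;	/* snapshot taken under the lock */

	if (!peer_discovery_disabled_locked(lp)) {
		/* Early exit: drop the lock before returning. */
		pthread_mutex_unlock(&lp->lock);
		return REPLY_IGNORED;
	}

	pthread_mutex_unlock(&lp->lock);

	/* From here on only the snapshot is used, not lp->state. */
	if (state & PEER_PING_FAILED)
		return REPLY_PING_FAILED;

	return REPLY_PROCESSED;
}
```

In this sketch a peer without the multi-rail bit is handled by the reply
path even when the global setting leaves discovery enabled, which is the
behaviour the commit message describes for older router peers.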