From patchwork Thu Feb 27 21:17:30 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 11410547
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 32984138D
	for ; Thu, 27 Feb 2020 21:41:01 +0000 (UTC)
Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 1B62324690
	for ; Thu, 27 Feb 2020 21:41:01 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1B62324690
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org
Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 43996349962;
	Thu, 27 Feb 2020 13:33:22 -0800 (PST)
X-Original-To: lustre-devel@lists.lustre.org
Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com
Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 340983489CC
	for ; Thu, 27 Feb 2020 13:21:20 -0800 (PST)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 20B86A141;
	Thu, 27 Feb 2020 16:18:20 -0500 (EST)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 1F9E546A; Thu, 27 Feb 2020 16:18:20 -0500 (EST)
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Date: Thu, 27 Feb 2020 16:17:30 -0500
Message-Id: <1582838290-17243-583-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
References: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 582/622] lnet: Fix source specified route selection
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Chris Horn

If lnet_send() is called with a specific src_nid, but
rtr_nid == LNET_NID_ANY and the message needs to be routed, then we
need to ensure that the lnet_peer_ni of our next hop is on the same
network as the lnet_ni associated with the src_nid. Otherwise we may
end up choosing an lnet_peer_ni that cannot be reached from the
specified source.

WC-bug-id: https://jira.whamcloud.com/browse/LU-12919
Lustre-commit: f0aa632d4255 ("LU-12919 lnet: Fix source specified route selection")
Signed-off-by: Chris Horn
Reviewed-on: https://review.whamcloud.com/36622
Reviewed-by: Alexandr Boyko
Reviewed-by: Amir Shehata
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 net/lnet/lnet/lib-move.c | 41 +++++++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c
index 269b2d5..ca292a6 100644
--- a/net/lnet/lnet/lib-move.c
+++ b/net/lnet/lnet/lib-move.c
@@ -1290,7 +1290,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 }
 
 static struct lnet_route *
-lnet_find_route_locked(struct lnet_remotenet *rnet,
+lnet_find_route_locked(struct lnet_remotenet *rnet, u32 src_net,
 		       struct lnet_route **prev_route,
 		       struct lnet_peer_ni **gwni)
 {
@@ -1299,6 +1299,8 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 	struct lnet_route *last_route;
 	struct lnet_route *route;
 	int rc;
+	u32 restrict_net;
+	u32 any_net = LNET_NIDNET(LNET_NID_ANY);
 
 	best_route = NULL;
 	last_route = NULL;
@@ -1306,14 +1308,23 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 		if (!lnet_is_route_alive(route))
 			continue;
 
+		/* If the src_net is specified then we need to find an lpni
+		 * on that network
+		 */
+		restrict_net = src_net == any_net ? route->lr_lnet : src_net;
 		if (!best_route) {
-			best_route = route;
-			last_route = route;
-			best_gw_ni = lnet_find_best_lpni_on_net(NULL,
-								LNET_NID_ANY,
-								route->lr_gateway,
-								route->lr_lnet);
-			LASSERT(best_gw_ni);
+			lpni = lnet_find_best_lpni_on_net(NULL, LNET_NID_ANY,
+							  route->lr_gateway,
+							  restrict_net);
+			if (lpni) {
+				best_route = route;
+				last_route = route;
+				best_gw_ni = lpni;
+			} else {
+				CERROR("Gateway %s does not have a peer NI on net %s\n",
+				       libcfs_nid2str(route->lr_gateway->lp_primary_nid),
+				       libcfs_net2str(restrict_net));
+			}
 			continue;
 		}
 
@@ -1327,8 +1338,13 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats,
 
 		lpni = lnet_find_best_lpni_on_net(NULL, LNET_NID_ANY,
 						  route->lr_gateway,
-						  route->lr_lnet);
-		LASSERT(lpni);
+						  restrict_net);
+		if (!lpni) {
+			CERROR("Gateway %s does not have a peer NI on net %s\n",
+			       libcfs_nid2str(route->lr_gateway->lp_primary_nid),
+			       libcfs_net2str(restrict_net));
+			continue;
+		}
 
 		if (rc == 1) {
 			best_route = route;
@@ -1868,8 +1884,9 @@ struct lnet_ni *
 			return -EHOSTUNREACH;
 		}
 
-		best_route = lnet_find_route_locked(best_rnet, &last_route,
-						    &gwni);
+		best_route = lnet_find_route_locked(best_rnet,
+						    LNET_NIDNET(src_nid),
+						    &last_route, &gwni);
 		if (!best_route) {
 			CERROR("no route to %s from %s\n",
 			       libcfs_nid2str(dst_nid),
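
Note: the restrict_net rule introduced by this patch can be illustrated outside the kernel with a minimal userspace sketch. Everything prefixed sketch_ or SKETCH_ below is hypothetical and exists only for illustration; it is not LNet API. The sketch also takes the first usable route, whereas lnet_find_route_locked() goes on to compare candidate routes against the current best, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LNET_NIDNET(LNET_NID_ANY): no source net given. */
#define SKETCH_NET_ANY UINT32_MAX
#define SKETCH_MAX_NETS 4

/* Simplified gateway: just the networks on which it has a peer NI. */
struct sketch_gateway {
	uint32_t nets[SKETCH_MAX_NETS];
	size_t nnets;
};

/* Simplified route: its gateway plus the route's own net (lr_lnet in the patch). */
struct sketch_route {
	const struct sketch_gateway *gw;
	uint32_t lnet;
};

/* Return 1 if the gateway has a peer NI on the given net, else 0. */
static int sketch_gw_has_net(const struct sketch_gateway *gw, uint32_t net)
{
	for (size_t i = 0; i < gw->nnets; i++)
		if (gw->nets[i] == net)
			return 1;
	return 0;
}

/*
 * Return the index of the first route whose gateway is reachable from
 * src_net, or -1 if there is none. When src_net is SKETCH_NET_ANY the
 * route's own net is used, mirroring the restrict_net expression in the patch.
 */
static int sketch_find_route(const struct sketch_route *routes, size_t n,
			     uint32_t src_net)
{
	assert(routes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint32_t restrict_net = src_net == SKETCH_NET_ANY ?
					routes[i].lnet : src_net;

		/* Skip the route instead of asserting, as the patch now does. */
		if (!sketch_gw_has_net(routes[i].gw, restrict_net))
			continue;
		return (int)i;
	}
	return -1;
}
```

With two routes on net 1, where only the second gateway also has a peer NI on net 2, a lookup with src_net 2 skips the first route and returns the second. A lookup with a src_net on which no gateway has a peer NI returns -1, which corresponds to the "no route to ..." error path in the caller.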