From patchwork Tue Sep 25 01:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613181 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 373B4157B for ; Tue, 25 Sep 2018 01:11:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3C1322A052 for ; Tue, 25 Sep 2018 01:11:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 309D92A05D; Tue, 25 Sep 2018 01:11:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id BE1DA2A052 for ; Tue, 25 Sep 2018 01:11:30 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 66BAA4C3FB3; Mon, 24 Sep 2018 18:11:30 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 8871C4C3CD1 for ; Mon, 24 Sep 2018 18:11:28 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id A69A1B037; Tue, 25 Sep 2018 01:11:27 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Tue, 25 Sep 2018 11:07:15 +1000 Message-ID: <153783763544.32103.10828937501297905304.stgit@noble> In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble> References: <153783752960.32103.8394391715843917125.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 15/34] LU-7734 lnet: handle N NIs to 1 LND peer X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP From: Amir Shehata This patch changes o2iblnd only, as socklnd already handles this case. In the new design you can have multiple NIs communicating to one peer. In the o2ilbnd the kib_peer has a pointer to the NI which implies a 1:1 relationship. This patch changes kiblnd_find_peer_locked() to use the peer NID and the NI NID as the key. This way a new peer will be created for each unique NI/peer_NI pair. This is similar to how socklnd handles this case. Signed-off-by: Amir Shehata Change-Id: Ifab7764489757ea473b15c46c1a22ef9ceeeceea Reviewed-on: http://review.whamcloud.com/19306 Reviewed-by: Doug Oucharek Tested-by: Doug Oucharek Signed-off-by: NeilBrown --- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 13 ++++++++++--- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h | 2 +- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 8 ++++---- 3 files changed, 15 insertions(+), 8 deletions(-) diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index 2e71abbf8a0c..64df49146413 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -379,7 +379,7 @@ void kiblnd_destroy_peer(struct kib_peer *peer) atomic_dec(&net->ibn_npeers); } -struct kib_peer *kiblnd_find_peer_locked(lnet_nid_t nid) +struct kib_peer *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid) { /* * the caller is responsible for accounting the additional reference @@ -391,7 +391,14 @@ struct kib_peer *kiblnd_find_peer_locked(lnet_nid_t nid) list_for_each_entry(peer, peer_list, ibp_list) { LASSERT(!kiblnd_peer_idle(peer)); - if (peer->ibp_nid != nid) + /* + * Match a peer if its NID and the NID of the local NI it + * communicates over are the same. Otherwise don't match + * the peer, which will result in a new lnd peer being + * created. + */ + if (peer->ibp_nid != nid || + peer->ibp_ni->ni_nid != ni->ni_nid) continue; CDEBUG(D_NET, "got peer [%p] -> %s (%d) version: %x\n", @@ -1041,7 +1048,7 @@ static void kiblnd_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when) read_lock_irqsave(glock, flags); - peer = kiblnd_find_peer_locked(nid); + peer = kiblnd_find_peer_locked(ni, nid); if (peer) last_alive = peer->ibp_last_alive; diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h index 522eb150d9a6..520f586015f4 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h @@ -1019,7 +1019,7 @@ void kiblnd_destroy_peer(struct kib_peer *peer); bool kiblnd_reconnect_peer(struct kib_peer *peer); void kiblnd_destroy_dev(struct kib_dev *dev); void kiblnd_unlink_peer_locked(struct kib_peer *peer); -struct kib_peer *kiblnd_find_peer_locked(lnet_nid_t nid); +struct kib_peer *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid); int kiblnd_close_stale_conns_locked(struct kib_peer *peer, int version, __u64 incarnation); int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why); diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c index af8f863b6a68..f4b76347e1c6 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -1370,7 +1370,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid) */ read_lock_irqsave(g_lock, flags); - peer = kiblnd_find_peer_locked(nid); + peer = kiblnd_find_peer_locked(ni, nid); if (peer && !list_empty(&peer->ibp_conns)) { /* Found a peer with an established connection */ conn = kiblnd_get_conn_locked(peer); @@ -1388,7 +1388,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid) /* Re-try with a write lock */ write_lock(g_lock); - peer = kiblnd_find_peer_locked(nid); + peer = kiblnd_find_peer_locked(ni, nid); if (peer) { if (list_empty(&peer->ibp_conns)) { /* found a peer, but it's still connecting... */ @@ -1426,7 +1426,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid) write_lock_irqsave(g_lock, flags); - peer2 = kiblnd_find_peer_locked(nid); + peer2 = kiblnd_find_peer_locked(ni, nid); if (peer2) { if (list_empty(&peer2->ibp_conns)) { /* found a peer, but it's still connecting... */ @@ -2388,7 +2388,7 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob) write_lock_irqsave(g_lock, flags); - peer2 = kiblnd_find_peer_locked(nid); + peer2 = kiblnd_find_peer_locked(ni, nid); if (peer2) { if (!peer2->ibp_version) { peer2->ibp_version = version;