From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591331 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E1022921 for ; Fri, 7 Sep 2018 00:51:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D1EF32B17D for ; Fri, 7 Sep 2018 00:51:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C3C142B181; Fri, 7 Sep 2018 00:51:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 824C92B17D for ; Fri, 7 Sep 2018 00:51:42 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id F21724E30DB; Thu, 6 Sep 2018 17:51:41 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 8BD9A4E306E for ; Thu, 6 Sep 2018 17:51:39 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id C2016AF6A; Fri, 7 Sep 2018 00:51:38 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 
10:49:31 +1000 Message-ID: <153628137125.8267.7825080417267355121.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 01/34] struct lnet_ni - reformat comments. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: Amir Shehata Reviewed-by: Doug Oucharek Reviewed-by: Olaf Weber Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-types.h | 38 +++++++++++++++----- 1 file changed, 29 insertions(+), 9 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 6d4106fd9039..078bc97a9ebf 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -263,18 +263,38 @@ struct lnet_ni { int ni_peerrtrcredits; /* seconds to consider peer dead */ int ni_peertimeout; - int ni_ncpts; /* number of CPTs */ - __u32 *ni_cpts; /* bond NI on some CPTs */ - lnet_nid_t ni_nid; /* interface's NID */ - void *ni_data; /* instance-specific data */ + /* number of CPTs */ + int ni_ncpts; + + /* bond NI on some CPTs */ + __u32 *ni_cpts; + + /* interface's NID */ + lnet_nid_t ni_nid; + + /* instance-specific data */ + void *ni_data; + struct lnet_lnd *ni_lnd; /* procedural interface */ - struct lnet_tx_queue **ni_tx_queues; /* percpt TX queues */ - int **ni_refs; /* percpt reference
count */ - time64_t ni_last_alive;/* when I was last alive */ - struct lnet_ni_status *ni_status; /* my health status */ + + /* percpt TX queues */ + struct lnet_tx_queue **ni_tx_queues; + + /* percpt reference count */ + int **ni_refs; + + /* when I was last alive */ + time64_t ni_last_alive; + + /* my health status */ + struct lnet_ni_status *ni_status; + /* per NI LND tunables */ struct lnet_ioctl_config_lnd_tunables *ni_lnd_tunables; - /* equivalent interfaces to use */ + /* + * equivalent interfaces to use + * This is an array because socklnd bonding can still be configured + */ char *ni_interfaces[LNET_MAX_INTERFACES]; /* original net namespace */ struct net *ni_net_ns; From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591333 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 820A7112B for ; Fri, 7 Sep 2018 00:51:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7336D2B17D for ; Fri, 7 Sep 2018 00:51:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 67A042B181; Fri, 7 Sep 2018 00:51:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 483362B17D for ; Fri, 7 Sep 2018 00:51:49 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by 
pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 037C74E30DB; Thu, 6 Sep 2018 17:51:49 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0ED334E309D for ; Thu, 6 Sep 2018 17:51:47 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 310D8AEF1; Fri, 7 Sep 2018 00:51:46 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137129.8267.345070695068208597.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 02/34] lnet: Create struct lnet_net X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This will contain some fields from lnet_ni, to be shared between multiple ni on the one network. For now, only tunables are moved across, using struct lnet_ioctl_config_lnd_cmn_tunables which is changed to use signed values so -1 can be stored. -1 means "no value" If the tunables haven't been initialised, then net_tunables_set is false. Previously a NULL pointer had this meaning. A 'struct lnet_net' is allocated as part of lnet_ni_alloc(), and freed by lnet_ni_free(). 
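The effect of the signed tunables can be illustrated with a minimal userspace sketch. It is not code from this patch: struct cmn_tunables, struct net_sketch, TUNABLE_UNSET and both fill helpers are hypothetical stand-ins for lnet_ioctl_config_lnd_cmn_tunables, struct lnet_net, the literal -1 and the per-LND setup code. It assumes only what the diff shows: every field starts at -1, o2iblnd fills each field that is still -1, and socklnd fills all four fields only while net_tunables_set is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TUNABLE_UNSET (-1)	/* hypothetical name for the "no value" marker */

/* Hypothetical stand-in for struct lnet_ioctl_config_lnd_cmn_tunables. */
struct cmn_tunables {
	int32_t peer_timeout;
	int32_t peer_tx_credits;
	int32_t peer_rtr_credits;
	int32_t max_tx_credits;
};

/* Hypothetical stand-in for struct lnet_net. */
struct net_sketch {
	struct cmn_tunables tunables;
	bool tunables_set;
};

/* Start with every tunable marked "no value", as lnet_ni_alloc() does. */
static void net_sketch_init(struct net_sketch *net)
{
	assert(net);
	net->tunables.peer_timeout = TUNABLE_UNSET;
	net->tunables.peer_tx_credits = TUNABLE_UNSET;
	net->tunables.peer_rtr_credits = TUNABLE_UNSET;
	net->tunables.max_tx_credits = TUNABLE_UNSET;
	net->tunables_set = false;
}

/* Per-field style: only fields still at -1 receive the default. */
static void net_sketch_fill_unset(struct net_sketch *net,
				  const struct cmn_tunables *def)
{
	struct cmn_tunables *t = &net->tunables;

	if (t->peer_timeout == TUNABLE_UNSET)
		t->peer_timeout = def->peer_timeout;
	if (t->peer_tx_credits == TUNABLE_UNSET)
		t->peer_tx_credits = def->peer_tx_credits;
	if (t->peer_rtr_credits == TUNABLE_UNSET)
		t->peer_rtr_credits = def->peer_rtr_credits;
	if (t->max_tx_credits == TUNABLE_UNSET)
		t->max_tx_credits = def->max_tx_credits;
}

/* Flag style: copy all defaults once, then leave the values alone. */
static void net_sketch_fill_once(struct net_sketch *net,
				 const struct cmn_tunables *def)
{
	if (net->tunables_set)
		return;
	net->tunables = *def;
	net->tunables_set = true;
}
```

Because "unset" is -1 rather than 0, a value explicitly configured as 0 survives the per-field defaults pass. The old "if (!ni->ni_peertimeout)" style of check could not tell those two cases apart.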
This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Doug Oucharek <dougso@me.com>
Acked-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-types.h | 25 ++++++-- .../lustre/include/uapi/linux/lnet/lnet-dlc.h | 8 +-- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 2 - .../lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c | 61 +++++++++++--------- .../staging/lustre/lnet/klnds/socklnd/socklnd.c | 19 ++++-- drivers/staging/lustre/lnet/lnet/api-ni.c | 45 +++++++++------ drivers/staging/lustre/lnet/lnet/config.c | 24 ++++++-- drivers/staging/lustre/lnet/lnet/lib-move.c | 5 +- drivers/staging/lustre/lnet/lnet/peer.c | 9 ++- drivers/staging/lustre/lnet/lnet/router.c | 8 ++- drivers/staging/lustre/lnet/lnet/router_proc.c | 6 +- 11 files changed, 129 insertions(+), 83 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 078bc97a9ebf..ead8a4e1125a 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -43,6 +43,7 @@ #include #include +#include /* Max payload size */ #define LNET_MAX_PAYLOAD CONFIG_LNET_MAX_PAYLOAD @@ -252,17 +253,22 @@ struct lnet_tx_queue { struct list_head tq_delayed; /* delayed TXs */ }; +struct lnet_net { + /* network tunables */ + struct lnet_ioctl_config_lnd_cmn_tunables net_tunables; + + /* + * boolean to indicate that the tunables have been set and + * shouldn't be reset + */ + bool net_tunables_set; +}; + struct lnet_ni { spinlock_t ni_lock; struct list_head ni_list; /* chain on ln_nis */ struct list_head ni_cptlist; /* chain on ln_nis_cpt */ - int ni_maxtxcredits; /* # tx credits */ - /* # per-peer send credits */ - int ni_peertxcredits; - /* # per-peer router buffer credits */ - int ni_peerrtrcredits; - /* seconds to consider peer dead */ - int ni_peertimeout; + /* number of CPTs */ int ni_ncpts; @@ -286,6 +292,9 @@ struct lnet_ni { /* when I was last alive */ time64_t ni_last_alive; + /* pointer to parent network */ + struct lnet_net *ni_net; + /* my 
health status */ struct lnet_ni_status *ni_status; @@ -397,7 +406,7 @@ struct lnet_peer_table { * lnet_ni::ni_peertimeout has been set to a positive value */ #define lnet_peer_aliveness_enabled(lp) (the_lnet.ln_routing && \ - (lp)->lp_ni->ni_peertimeout > 0) + (lp)->lp_ni->ni_net->net_tunables.lct_peer_timeout > 0) struct lnet_route { struct list_head lr_list; /* chain on net */ diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h index c1619f411d81..a8eb3b8f9fd7 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h @@ -39,10 +39,10 @@ struct lnet_ioctl_config_lnd_cmn_tunables { __u32 lct_version; - __u32 lct_peer_timeout; - __u32 lct_peer_tx_credits; - __u32 lct_peer_rtr_credits; - __u32 lct_max_tx_credits; + __s32 lct_peer_timeout; + __s32 lct_peer_tx_credits; + __s32 lct_peer_rtr_credits; + __s32 lct_max_tx_credits; }; struct lnet_ioctl_config_o2iblnd_tunables { diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index f496e6fcc416..0d17e22c4401 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -337,7 +337,7 @@ int kiblnd_create_peer(struct lnet_ni *ni, struct kib_peer **peerp, peer->ibp_error = 0; peer->ibp_last_alive = 0; peer->ibp_max_frags = kiblnd_cfg_rdma_frags(peer->ibp_ni); - peer->ibp_queue_depth = ni->ni_peertxcredits; + peer->ibp_queue_depth = ni->ni_net->net_tunables.lct_peer_tx_credits; atomic_set(&peer->ibp_refcount, 1); /* 1 ref for caller */ INIT_LIST_HEAD(&peer->ibp_list); /* not in the peer table yet */ diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c index 39d07926d603..a1aca4dda38f 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c 
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c @@ -171,7 +171,7 @@ int kiblnd_msg_queue_size(int version, struct lnet_ni *ni) if (version == IBLND_MSG_VERSION_1) return IBLND_MSG_QUEUE_SIZE_V1; else if (ni) - return ni->ni_peertxcredits; + return ni->ni_net->net_tunables.lct_peer_tx_credits; else return peer_credits; } @@ -179,6 +179,7 @@ int kiblnd_msg_queue_size(int version, struct lnet_ni *ni) int kiblnd_tunables_setup(struct lnet_ni *ni) { struct lnet_ioctl_config_o2iblnd_tunables *tunables; + struct lnet_ioctl_config_lnd_cmn_tunables *net_tunables; /* * if there was no tunables specified, setup the tunables to be @@ -204,35 +205,39 @@ int kiblnd_tunables_setup(struct lnet_ni *ni) return -EINVAL; } - if (!ni->ni_peertimeout) - ni->ni_peertimeout = peer_timeout; + net_tunables = &ni->ni_net->net_tunables; - if (!ni->ni_maxtxcredits) - ni->ni_maxtxcredits = credits; + if (net_tunables->lct_peer_timeout == -1) + net_tunables->lct_peer_timeout = peer_timeout; - if (!ni->ni_peertxcredits) - ni->ni_peertxcredits = peer_credits; + if (net_tunables->lct_max_tx_credits == -1) + net_tunables->lct_max_tx_credits = credits; - if (!ni->ni_peerrtrcredits) - ni->ni_peerrtrcredits = peer_buffer_credits; + if (net_tunables->lct_peer_tx_credits == -1) + net_tunables->lct_peer_tx_credits = peer_credits; - if (ni->ni_peertxcredits < IBLND_CREDITS_DEFAULT) - ni->ni_peertxcredits = IBLND_CREDITS_DEFAULT; + if (net_tunables->lct_peer_rtr_credits == -1) + net_tunables->lct_peer_rtr_credits = peer_buffer_credits; - if (ni->ni_peertxcredits > IBLND_CREDITS_MAX) - ni->ni_peertxcredits = IBLND_CREDITS_MAX; + if (net_tunables->lct_peer_tx_credits < IBLND_CREDITS_DEFAULT) + net_tunables->lct_peer_tx_credits = IBLND_CREDITS_DEFAULT; - if (ni->ni_peertxcredits > credits) - ni->ni_peertxcredits = credits; + if (net_tunables->lct_peer_tx_credits > IBLND_CREDITS_MAX) + net_tunables->lct_peer_tx_credits = IBLND_CREDITS_MAX; + + if (net_tunables->lct_peer_tx_credits > + 
net_tunables->lct_max_tx_credits) + net_tunables->lct_peer_tx_credits = + net_tunables->lct_max_tx_credits; if (!tunables->lnd_peercredits_hiw) tunables->lnd_peercredits_hiw = peer_credits_hiw; - if (tunables->lnd_peercredits_hiw < ni->ni_peertxcredits / 2) - tunables->lnd_peercredits_hiw = ni->ni_peertxcredits / 2; + if (tunables->lnd_peercredits_hiw < net_tunables->lct_peer_tx_credits / 2) + tunables->lnd_peercredits_hiw = net_tunables->lct_peer_tx_credits / 2; - if (tunables->lnd_peercredits_hiw >= ni->ni_peertxcredits) - tunables->lnd_peercredits_hiw = ni->ni_peertxcredits - 1; + if (tunables->lnd_peercredits_hiw >= net_tunables->lct_peer_tx_credits) + tunables->lnd_peercredits_hiw = net_tunables->lct_peer_tx_credits - 1; if (tunables->lnd_map_on_demand <= 0 || tunables->lnd_map_on_demand > IBLND_MAX_RDMA_FRAGS) { @@ -252,21 +257,23 @@ int kiblnd_tunables_setup(struct lnet_ni *ni) if (tunables->lnd_map_on_demand > 0 && tunables->lnd_map_on_demand <= IBLND_MAX_RDMA_FRAGS / 8) { tunables->lnd_concurrent_sends = - ni->ni_peertxcredits * 2; + net_tunables->lct_peer_tx_credits * 2; } else { - tunables->lnd_concurrent_sends = ni->ni_peertxcredits; + tunables->lnd_concurrent_sends = + net_tunables->lct_peer_tx_credits; } } - if (tunables->lnd_concurrent_sends > ni->ni_peertxcredits * 2) - tunables->lnd_concurrent_sends = ni->ni_peertxcredits * 2; + if (tunables->lnd_concurrent_sends > net_tunables->lct_peer_tx_credits * 2) + tunables->lnd_concurrent_sends = net_tunables->lct_peer_tx_credits * 2; - if (tunables->lnd_concurrent_sends < ni->ni_peertxcredits / 2) - tunables->lnd_concurrent_sends = ni->ni_peertxcredits / 2; + if (tunables->lnd_concurrent_sends < net_tunables->lct_peer_tx_credits / 2) + tunables->lnd_concurrent_sends = net_tunables->lct_peer_tx_credits / 2; - if (tunables->lnd_concurrent_sends < ni->ni_peertxcredits) { + if (tunables->lnd_concurrent_sends < net_tunables->lct_peer_tx_credits) { CWARN("Concurrent sends %d is lower than message queue size: %d, 
performance may drop slightly.\n", - tunables->lnd_concurrent_sends, ni->ni_peertxcredits); + tunables->lnd_concurrent_sends, + net_tunables->lct_peer_tx_credits); } if (!tunables->lnd_fmr_pool_size) diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c index 4dde158451ea..4ad885f10235 100644 --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c @@ -2739,12 +2739,19 @@ ksocknal_startup(struct lnet_ni *ni) goto fail_0; spin_lock_init(&net->ksnn_lock); - net->ksnn_incarnation = ktime_get_real_ns(); - ni->ni_data = net; - ni->ni_peertimeout = *ksocknal_tunables.ksnd_peertimeout; - ni->ni_maxtxcredits = *ksocknal_tunables.ksnd_credits; - ni->ni_peertxcredits = *ksocknal_tunables.ksnd_peertxcredits; - ni->ni_peerrtrcredits = *ksocknal_tunables.ksnd_peerrtrcredits; + net->ksnn_incarnation = ktime_get_real_ns(); + ni->ni_data = net; + if (!ni->ni_net->net_tunables_set) { + ni->ni_net->net_tunables.lct_peer_timeout = + *ksocknal_tunables.ksnd_peertimeout; + ni->ni_net->net_tunables.lct_max_tx_credits = + *ksocknal_tunables.ksnd_credits; + ni->ni_net->net_tunables.lct_peer_tx_credits = + *ksocknal_tunables.ksnd_peertxcredits; + ni->ni_net->net_tunables.lct_peer_rtr_credits = + *ksocknal_tunables.ksnd_peerrtrcredits; + ni->ni_net->net_tunables_set = true; + } net->ksnn_ninterfaces = 0; if (!ni->ni_interfaces[0]) { diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index f9fcce2a5643..cd4189fa7acb 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1036,11 +1036,11 @@ lnet_ni_tq_credits(struct lnet_ni *ni) LASSERT(ni->ni_ncpts >= 1); if (ni->ni_ncpts == 1) - return ni->ni_maxtxcredits; + return ni->ni_net->net_tunables.lct_max_tx_credits; - credits = ni->ni_maxtxcredits / ni->ni_ncpts; - credits = max(credits, 8 * ni->ni_peertxcredits); - credits = 
min(credits, ni->ni_maxtxcredits); + credits = ni->ni_net->net_tunables.lct_max_tx_credits / ni->ni_ncpts; + credits = max(credits, 8 * ni->ni_net->net_tunables.lct_peer_tx_credits); + credits = min(credits, ni->ni_net->net_tunables.lct_max_tx_credits); return credits; } @@ -1271,16 +1271,16 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) */ if (conf) { if (conf->cfg_config_u.cfg_net.net_peer_rtr_credits >= 0) - ni->ni_peerrtrcredits = + ni->ni_net->net_tunables.lct_peer_rtr_credits = conf->cfg_config_u.cfg_net.net_peer_rtr_credits; if (conf->cfg_config_u.cfg_net.net_peer_timeout >= 0) - ni->ni_peertimeout = + ni->ni_net->net_tunables.lct_peer_timeout = conf->cfg_config_u.cfg_net.net_peer_timeout; if (conf->cfg_config_u.cfg_net.net_peer_tx_credits != -1) - ni->ni_peertxcredits = + ni->ni_net->net_tunables.lct_peer_tx_credits = conf->cfg_config_u.cfg_net.net_peer_tx_credits; if (conf->cfg_config_u.cfg_net.net_max_tx_credits >= 0) - ni->ni_maxtxcredits = + ni->ni_net->net_tunables.lct_max_tx_credits = conf->cfg_config_u.cfg_net.net_max_tx_credits; } @@ -1297,8 +1297,6 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) goto failed0; } - LASSERT(ni->ni_peertimeout <= 0 || lnd->lnd_query); - lnet_net_lock(LNET_LOCK_EX); /* refcount for ln_nis */ lnet_ni_addref_locked(ni, 0); @@ -1314,13 +1312,18 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) lnet_ni_addref(ni); LASSERT(!the_lnet.ln_loni); the_lnet.ln_loni = ni; + ni->ni_net->net_tunables.lct_peer_tx_credits = 0; + ni->ni_net->net_tunables.lct_peer_rtr_credits = 0; + ni->ni_net->net_tunables.lct_max_tx_credits = 0; + ni->ni_net->net_tunables.lct_peer_timeout = 0; return 0; } - if (!ni->ni_peertxcredits || !ni->ni_maxtxcredits) { + if (!ni->ni_net->net_tunables.lct_peer_tx_credits || + !ni->ni_net->net_tunables.lct_max_tx_credits) { LCONSOLE_ERROR_MSG(0x107, "LNI %s has no %scredits\n", libcfs_lnd2str(lnd->lnd_type), - 
!ni->ni_peertxcredits ? + !ni->ni_net->net_tunables.lct_peer_tx_credits ? "" : "per-peer "); /* * shutdown the NI since if we get here then it must've already @@ -1343,9 +1346,11 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) add_device_randomness(&seed, sizeof(seed)); CDEBUG(D_LNI, "Added LNI %s [%d/%d/%d/%d]\n", - libcfs_nid2str(ni->ni_nid), ni->ni_peertxcredits, + libcfs_nid2str(ni->ni_nid), + ni->ni_net->net_tunables.lct_peer_tx_credits, lnet_ni_tq_credits(ni) * LNET_CPT_NUMBER, - ni->ni_peerrtrcredits, ni->ni_peertimeout); + ni->ni_net->net_tunables.lct_peer_rtr_credits, + ni->ni_net->net_tunables.lct_peer_timeout); return 0; failed0: @@ -1667,10 +1672,14 @@ lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_data *config) } config->cfg_nid = ni->ni_nid; - config->cfg_config_u.cfg_net.net_peer_timeout = ni->ni_peertimeout; - config->cfg_config_u.cfg_net.net_max_tx_credits = ni->ni_maxtxcredits; - config->cfg_config_u.cfg_net.net_peer_tx_credits = ni->ni_peertxcredits; - config->cfg_config_u.cfg_net.net_peer_rtr_credits = ni->ni_peerrtrcredits; + config->cfg_config_u.cfg_net.net_peer_timeout = + ni->ni_net->net_tunables.lct_peer_timeout; + config->cfg_config_u.cfg_net.net_max_tx_credits = + ni->ni_net->net_tunables.lct_max_tx_credits; + config->cfg_config_u.cfg_net.net_peer_tx_credits = + ni->ni_net->net_tunables.lct_peer_tx_credits; + config->cfg_config_u.cfg_net.net_peer_rtr_credits = + ni->ni_net->net_tunables.lct_peer_rtr_credits; net_config->ni_status = ni->ni_status->ns_status; diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 091c4f714e84..86a53854e427 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -114,29 +114,38 @@ lnet_ni_free(struct lnet_ni *ni) if (ni->ni_net_ns) put_net(ni->ni_net_ns); + kvfree(ni->ni_net); kfree(ni); } struct lnet_ni * -lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct 
list_head *nilist) +lnet_ni_alloc(__u32 net_id, struct cfs_expr_list *el, struct list_head *nilist) { struct lnet_tx_queue *tq; struct lnet_ni *ni; int rc; int i; + struct lnet_net *net; - if (!lnet_net_unique(net, nilist)) { + if (!lnet_net_unique(net_id, nilist)) { LCONSOLE_ERROR_MSG(0x111, "Duplicate network specified: %s\n", - libcfs_net2str(net)); + libcfs_net2str(net_id)); return NULL; } ni = kzalloc(sizeof(*ni), GFP_NOFS); - if (!ni) { + net = kzalloc(sizeof(*net), GFP_NOFS); + if (!ni || !net) { + kfree(ni); kfree(net); CERROR("Out of memory creating network %s\n", - libcfs_net2str(net)); + libcfs_net2str(net_id)); return NULL; } + /* initialize global parameters to undefined */ + net->net_tunables.lct_peer_timeout = -1; + net->net_tunables.lct_max_tx_credits = -1; + net->net_tunables.lct_peer_tx_credits = -1; + net->net_tunables.lct_peer_rtr_credits = -1; spin_lock_init(&ni->ni_lock); INIT_LIST_HEAD(&ni->ni_cptlist); @@ -160,7 +169,7 @@ lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist) rc = cfs_expr_list_values(el, LNET_CPT_NUMBER, &ni->ni_cpts); if (rc <= 0) { CERROR("Failed to set CPTs for NI %s: %d\n", - libcfs_net2str(net), rc); + libcfs_net2str(net_id), rc); goto failed; } @@ -173,8 +182,9 @@ lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist) ni->ni_ncpts = rc; } + ni->ni_net = net; /* LND will fill in the address part of the NID */ - ni->ni_nid = LNET_MKNID(net, 0); + ni->ni_nid = LNET_MKNID(net_id, 0); /* Store net namespace in which current ni is being created */ if (current->nsproxy->net_ns) diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index edcafac055ed..f186e6a16d34 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -524,7 +524,8 @@ lnet_peer_is_alive(struct lnet_peer *lp, unsigned long now) lp->lp_timestamp >= lp->lp_last_alive) return 0; - deadline = lp->lp_last_alive +
lp->lp_ni->ni_peertimeout; + deadline = lp->lp_last_alive + + lp->lp_ni->ni_net->net_tunables.lct_peer_timeout; alive = deadline > now; /* Update obsolete lp_alive except for routers assumed to be dead @@ -569,7 +570,7 @@ lnet_peer_alive_locked(struct lnet_peer *lp) libcfs_nid2str(lp->lp_nid), now, next_query, lnet_queryinterval, - lp->lp_ni->ni_peertimeout); + lp->lp_ni->ni_net->net_tunables.lct_peer_timeout); return 0; } } diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index d9452c322e4d..b76ac3e051d9 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -342,8 +342,8 @@ lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) goto out; } - lp->lp_txcredits = lp->lp_ni->ni_peertxcredits; - lp->lp_mintxcredits = lp->lp_ni->ni_peertxcredits; + lp->lp_txcredits = lp->lp_ni->ni_net->net_tunables.lct_peer_tx_credits; + lp->lp_mintxcredits = lp->lp_ni->ni_net->net_tunables.lct_peer_tx_credits; lp->lp_rtrcredits = lnet_peer_buffer_credits(lp->lp_ni); lp->lp_minrtrcredits = lnet_peer_buffer_credits(lp->lp_ni); @@ -383,7 +383,7 @@ lnet_debug_peer(lnet_nid_t nid) CDEBUG(D_WARNING, "%-24s %4d %5s %5d %5d %5d %5d %5d %ld\n", libcfs_nid2str(lp->lp_nid), lp->lp_refcount, - aliveness, lp->lp_ni->ni_peertxcredits, + aliveness, lp->lp_ni->ni_net->net_tunables.lct_peer_tx_credits, lp->lp_rtrcredits, lp->lp_minrtrcredits, lp->lp_txcredits, lp->lp_mintxcredits, lp->lp_txqnob); @@ -438,7 +438,8 @@ lnet_get_peer_info(__u32 peer_index, __u64 *nid, *nid = lp->lp_nid; *refcount = lp->lp_refcount; - *ni_peer_tx_credits = lp->lp_ni->ni_peertxcredits; + *ni_peer_tx_credits = + lp->lp_ni->ni_net->net_tunables.lct_peer_tx_credits; *peer_tx_credits = lp->lp_txcredits; *peer_rtr_credits = lp->lp_rtrcredits; *peer_min_rtr_credits = lp->lp_mintxcredits; diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 02241fbc9eaa..7d61c5d71426 100644 
--- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -57,9 +57,11 @@ MODULE_PARM_DESC(auto_down, "Automatically mark peers down on comms error"); int lnet_peer_buffer_credits(struct lnet_ni *ni) { + struct lnet_net *net = ni->ni_net; + /* NI option overrides LNet default */ - if (ni->ni_peerrtrcredits > 0) - return ni->ni_peerrtrcredits; + if (net->net_tunables.lct_peer_rtr_credits > 0) + return net->net_tunables.lct_peer_rtr_credits; if (peer_buffer_credits > 0) return peer_buffer_credits; @@ -67,7 +69,7 @@ lnet_peer_buffer_credits(struct lnet_ni *ni) * As an approximation, allow this peer the same number of router * buffers as it is allowed outstanding sends */ - return ni->ni_peertxcredits; + return net->net_tunables.lct_peer_tx_credits; } /* forward ref's */ diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c index 31f4982f7f17..19cea7076057 100644 --- a/drivers/staging/lustre/lnet/lnet/router_proc.c +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c @@ -489,7 +489,7 @@ static int proc_lnet_peers(struct ctl_table *table, int write, int nrefs = peer->lp_refcount; time64_t lastalive = -1; char *aliveness = "NA"; - int maxcr = peer->lp_ni->ni_peertxcredits; + int maxcr = peer->lp_ni->ni_net->net_tunables.lct_peer_tx_credits; int txcr = peer->lp_txcredits; int mintxcr = peer->lp_mintxcredits; int rtrcr = peer->lp_rtrcredits; @@ -704,8 +704,8 @@ static int proc_lnet_nis(struct ctl_table *table, int write, "%-24s %6s %5lld %4d %4d %4d %5d %5d %5d\n", libcfs_nid2str(ni->ni_nid), stat, last_alive, *ni->ni_refs[i], - ni->ni_peertxcredits, - ni->ni_peerrtrcredits, + ni->ni_net->net_tunables.lct_peer_tx_credits, + ni->ni_net->net_tunables.lct_peer_rtr_credits, tq->tq_credits_max, tq->tq_credits, tq->tq_credits_min); From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 
NeilBrown X-Patchwork-Id: 10591335 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C80A4112B for ; Fri, 7 Sep 2018 00:51:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B990A2B17D for ; Fri, 7 Sep 2018 00:51:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id ADBED2B181; Fri, 7 Sep 2018 00:51:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 08FE92B17D for ; Fri, 7 Sep 2018 00:51:59 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 732914E2FF8; Thu, 6 Sep 2018 17:51:58 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 84B504E2FDC for ; Thu, 6 Sep 2018 17:51:56 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id BB8B6AED7; Fri, 7 Sep 2018 00:51:55 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137133.8267.15885218437939976879.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: 
<153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 03/34] lnet: struct lnet_ni: move ni_lnd to lnet_net X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Also make some other minor changes to the structures. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Acked-by: James Simmons Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-types.h | 13 ++++++++----- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 2 +- .../staging/lustre/lnet/klnds/socklnd/socklnd.c | 2 +- drivers/staging/lustre/lnet/lnet/acceptor.c | 4 ++-- drivers/staging/lustre/lnet/lnet/api-ni.c | 16 ++++++++-------- drivers/staging/lustre/lnet/lnet/lib-move.c | 16 ++++++++-------- drivers/staging/lustre/lnet/lnet/lo.c | 2 +- drivers/staging/lustre/lnet/lnet/router.c | 10 +++++----- drivers/staging/lustre/lnet/lnet/router_proc.c | 2 +- 9 files changed, 35 insertions(+), 32 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index ead8a4e1125a..e170eb07a5bf 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -262,12 +262,17 @@ struct lnet_net { * shouldn't be reset */ bool net_tunables_set; + /* procedural interface */ + struct lnet_lnd *net_lnd; }; struct lnet_ni { - spinlock_t ni_lock; - struct list_head ni_list; /* chain on ln_nis */ - struct list_head ni_cptlist; /* chain on ln_nis_cpt */ + /* chain on ln_nis */ + struct list_head ni_list; + /* chain on ln_nis_cpt */ + 
struct list_head ni_cptlist; + + spinlock_t ni_lock; /* number of CPTs */ int ni_ncpts; @@ -281,8 +286,6 @@ struct lnet_ni { /* instance-specific data */ void *ni_data; - struct lnet_lnd *ni_lnd; /* procedural interface */ - /* percpt TX queues */ struct lnet_tx_queue **ni_tx_queues; diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index 0d17e22c4401..5e1592b398c1 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -2830,7 +2830,7 @@ static int kiblnd_startup(struct lnet_ni *ni) int rc; int newdev; - LASSERT(ni->ni_lnd == &the_o2iblnd); + LASSERT(ni->ni_net->net_lnd == &the_o2iblnd); if (kiblnd_data.kib_init == IBLND_INIT_NOTHING) { rc = kiblnd_base_startup(); diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c index 4ad885f10235..2036a0ae5917 100644 --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c @@ -2726,7 +2726,7 @@ ksocknal_startup(struct lnet_ni *ni) int rc; int i; - LASSERT(ni->ni_lnd == &the_ksocklnd); + LASSERT(ni->ni_net->net_lnd == &the_ksocklnd); if (ksocknal_data.ksnd_init == SOCKNAL_INIT_NOTHING) { rc = ksocknal_base_startup(); diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c index 3ae3ca1311a1..f8c921f0221c 100644 --- a/drivers/staging/lustre/lnet/lnet/acceptor.c +++ b/drivers/staging/lustre/lnet/lnet/acceptor.c @@ -306,7 +306,7 @@ lnet_accept(struct socket *sock, __u32 magic) return -EPERM; } - if (!ni->ni_lnd->lnd_accept) { + if (!ni->ni_net->net_lnd->lnd_accept) { /* This catches a request for the loopback LND */ lnet_ni_decref(ni); LCONSOLE_ERROR_MSG(0x121, "Refusing connection from %pI4h for %s: NI doesn not accept IP connections\n", @@ -317,7 +317,7 @@ lnet_accept(struct socket *sock, __u32 magic) CDEBUG(D_NET, "Accept 
%s from %pI4h\n", libcfs_nid2str(cr.acr_nid), &peer_ip); - rc = ni->ni_lnd->lnd_accept(ni, sock); + rc = ni->ni_net->net_lnd->lnd_accept(ni, sock); lnet_ni_decref(ni); return rc; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index cd4189fa7acb..0896e75bc3d7 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -799,7 +799,7 @@ lnet_count_acceptor_nis(void) cpt = lnet_net_lock_current(); list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - if (ni->ni_lnd->lnd_accept) + if (ni->ni_net->net_lnd->lnd_accept) count++; } @@ -1098,13 +1098,13 @@ lnet_clear_zombies_nis_locked(void) continue; } - ni->ni_lnd->lnd_refcount--; + ni->ni_net->net_lnd->lnd_refcount--; lnet_net_unlock(LNET_LOCK_EX); - islo = ni->ni_lnd->lnd_type == LOLND; + islo = ni->ni_net->net_lnd->lnd_type == LOLND; LASSERT(!in_interrupt()); - ni->ni_lnd->lnd_shutdown(ni); + ni->ni_net->net_lnd->lnd_shutdown(ni); /* * can't deref lnd anymore now; it might have unregistered @@ -1248,7 +1248,7 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) lnd->lnd_refcount++; lnet_net_unlock(LNET_LOCK_EX); - ni->ni_lnd = lnd; + ni->ni_net->net_lnd = lnd; if (conf && conf->cfg_hdr.ioc_len > sizeof(*conf)) lnd_tunables = (struct lnet_ioctl_config_lnd_tunables *)conf->cfg_bulk; @@ -1794,7 +1794,7 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) if (rc) goto failed1; - if (ni->ni_lnd->lnd_accept) { + if (ni->ni_net->net_lnd->lnd_accept) { rc = lnet_acceptor_start(); if (rc < 0) { /* shutdown the ni that we just started */ @@ -2074,10 +2074,10 @@ LNetCtl(unsigned int cmd, void *arg) if (!ni) return -EINVAL; - if (!ni->ni_lnd->lnd_ctl) + if (!ni->ni_net->net_lnd->lnd_ctl) rc = -EINVAL; else - rc = ni->ni_lnd->lnd_ctl(ni, cmd, arg); + rc = ni->ni_net->net_lnd->lnd_ctl(ni, cmd, arg); lnet_ni_decref(ni); return rc; diff --git 
a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index f186e6a16d34..1bf12af87a20 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -406,7 +406,7 @@ lnet_ni_recv(struct lnet_ni *ni, void *private, struct lnet_msg *msg, iov_iter_bvec(&to, ITER_BVEC | READ, kiov, niov, mlen + offset); iov_iter_advance(&to, offset); } - rc = ni->ni_lnd->lnd_recv(ni, private, msg, delayed, &to, rlen); + rc = ni->ni_net->net_lnd->lnd_recv(ni, private, msg, delayed, &to, rlen); if (rc < 0) lnet_finalize(ni, msg, rc); } @@ -461,7 +461,7 @@ lnet_ni_send(struct lnet_ni *ni, struct lnet_msg *msg) LASSERT(LNET_NETTYP(LNET_NIDNET(ni->ni_nid)) == LOLND || (msg->msg_txcredit && msg->msg_peertxcredit)); - rc = ni->ni_lnd->lnd_send(ni, priv, msg); + rc = ni->ni_net->net_lnd->lnd_send(ni, priv, msg); if (rc < 0) lnet_finalize(ni, msg, rc); } @@ -474,10 +474,10 @@ lnet_ni_eager_recv(struct lnet_ni *ni, struct lnet_msg *msg) LASSERT(!msg->msg_sending); LASSERT(msg->msg_receiving); LASSERT(!msg->msg_rx_ready_delay); - LASSERT(ni->ni_lnd->lnd_eager_recv); + LASSERT(ni->ni_net->net_lnd->lnd_eager_recv); msg->msg_rx_ready_delay = 1; - rc = ni->ni_lnd->lnd_eager_recv(ni, msg->msg_private, msg, + rc = ni->ni_net->net_lnd->lnd_eager_recv(ni, msg->msg_private, msg, &msg->msg_private); if (rc) { CERROR("recv from %s / send to %s aborted: eager_recv failed %d\n", @@ -496,10 +496,10 @@ lnet_ni_query_locked(struct lnet_ni *ni, struct lnet_peer *lp) time64_t last_alive = 0; LASSERT(lnet_peer_aliveness_enabled(lp)); - LASSERT(ni->ni_lnd->lnd_query); + LASSERT(ni->ni_net->net_lnd->lnd_query); lnet_net_unlock(lp->lp_cpt); - ni->ni_lnd->lnd_query(ni, lp->lp_nid, &last_alive); + ni->ni_net->net_lnd->lnd_query(ni, lp->lp_nid, &last_alive); lnet_net_lock(lp->lp_cpt); lp->lp_last_query = ktime_get_seconds(); @@ -1287,7 +1287,7 @@ lnet_parse_put(struct lnet_ni *ni, struct lnet_msg *msg) info.mi_roffset = 
hdr->msg.put.offset; info.mi_mbits = hdr->msg.put.match_bits; - msg->msg_rx_ready_delay = !ni->ni_lnd->lnd_eager_recv; + msg->msg_rx_ready_delay = !ni->ni_net->net_lnd->lnd_eager_recv; ready_delay = msg->msg_rx_ready_delay; again: @@ -1518,7 +1518,7 @@ lnet_parse_forward_locked(struct lnet_ni *ni, struct lnet_msg *msg) if (msg->msg_rxpeer->lp_rtrcredits <= 0 || lnet_msg2bufpool(msg)->rbp_credits <= 0) { - if (!ni->ni_lnd->lnd_eager_recv) { + if (!ni->ni_net->net_lnd->lnd_eager_recv) { msg->msg_rx_ready_delay = 1; } else { lnet_net_unlock(msg->msg_rx_cpt); diff --git a/drivers/staging/lustre/lnet/lnet/lo.c b/drivers/staging/lustre/lnet/lnet/lo.c index eb14146bd879..8167980c2323 100644 --- a/drivers/staging/lustre/lnet/lnet/lo.c +++ b/drivers/staging/lustre/lnet/lnet/lo.c @@ -83,7 +83,7 @@ lolnd_shutdown(struct lnet_ni *ni) static int lolnd_startup(struct lnet_ni *ni) { - LASSERT(ni->ni_lnd == &the_lolnd); + LASSERT(ni->ni_net->net_lnd == &the_lolnd); LASSERT(!lolnd_instanced); lolnd_instanced = 1; diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 7d61c5d71426..0c0ec0b27982 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -154,14 +154,14 @@ lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer *lp) lp->lp_notifylnd = 0; lp->lp_notify = 0; - if (notifylnd && ni->ni_lnd->lnd_notify) { + if (notifylnd && ni->ni_net->net_lnd->lnd_notify) { lnet_net_unlock(lp->lp_cpt); /* * A new notification could happen now; I'll handle it * when control returns to me */ - ni->ni_lnd->lnd_notify(ni, lp->lp_nid, alive); + ni->ni_net->net_lnd->lnd_notify(ni, lp->lp_nid, alive); lnet_net_lock(lp->lp_cpt); } @@ -380,8 +380,8 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, lnet_net_unlock(LNET_LOCK_EX); /* XXX Assume alive */ - if (ni->ni_lnd->lnd_notify) - ni->ni_lnd->lnd_notify(ni, gateway, 1); + if (ni->ni_net->net_lnd->lnd_notify) + 
ni->ni_net->net_lnd->lnd_notify(ni, gateway, 1); lnet_net_lock(LNET_LOCK_EX); } @@ -818,7 +818,7 @@ lnet_update_ni_status_locked(void) now = ktime_get_real_seconds(); list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - if (ni->ni_lnd->lnd_type == LOLND) + if (ni->ni_net->net_lnd->lnd_type == LOLND) continue; if (now < ni->ni_last_alive + timeout) diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c index 19cea7076057..f3ccd6a2b70e 100644 --- a/drivers/staging/lustre/lnet/lnet/router_proc.c +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c @@ -674,7 +674,7 @@ static int proc_lnet_nis(struct ctl_table *table, int write, last_alive = now - ni->ni_last_alive; /* @lo forever alive */ - if (ni->ni_lnd->lnd_type == LOLND) + if (ni->ni_net->net_lnd->lnd_type == LOLND) last_alive = 0; lnet_ni_lock(ni); From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591337 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 20B9E921 for ; Fri, 7 Sep 2018 00:52:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 10E602B17D for ; Fri, 7 Sep 2018 00:52:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 02D472B181; Fri, 7 Sep 2018 00:52:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) 
by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 650B92B17D for ; Fri, 7 Sep 2018 00:52:07 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 06FF44E30CB; Thu, 6 Sep 2018 17:52:07 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 863164E2FBA for ; Thu, 6 Sep 2018 17:52:04 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 9F0A7AEF1; Fri, 7 Sep 2018 00:52:03 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137138.8267.11270892110905717079.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 04/34] lnet: embed lnd_tunables in lnet_ni X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Instead of a pointer, embed the data struct. Also other related changes. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Doug Oucharek
--- .../staging/lustre/include/linux/lnet/lib-types.h | 6 ++++ .../lustre/include/uapi/linux/lnet/lnet-dlc.h | 10 +++++-- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 2 + .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h | 6 ++-- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 8 +++--- .../lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c | 13 +++------- drivers/staging/lustre/lnet/lnet/api-ni.c | 27 +++++++++----------- drivers/staging/lustre/lnet/lnet/config.c | 2 - 8 files changed, 36 insertions(+), 38 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index e170eb07a5bf..c5e3363de727 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -302,7 +302,11 @@ struct lnet_ni { struct lnet_ni_status *ni_status; /* per NI LND tunables */ - struct lnet_ioctl_config_lnd_tunables *ni_lnd_tunables; + struct lnet_lnd_tunables ni_lnd_tunables; + + /* lnd tunables set explicitly */ + bool ni_lnd_tunables_set; + /* * equivalent interfaces to use * This is an array because socklnd bonding can still be configured diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h index a8eb3b8f9fd7..ac29f9d24d5d 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h @@ -57,11 +57,15 @@ struct lnet_ioctl_config_o2iblnd_tunables { __u16 pad; }; +struct lnet_lnd_tunables { + union { + struct lnet_ioctl_config_o2iblnd_tunables lnd_o2ib; + } lnd_tun_u; +}; + struct lnet_ioctl_config_lnd_tunables { struct lnet_ioctl_config_lnd_cmn_tunables lt_cmn; - union { - struct lnet_ioctl_config_o2iblnd_tunables lt_o2ib; - } lt_tun_u; + struct lnet_lnd_tunables lt_tun; }; struct lnet_ioctl_net_config { diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index 5e1592b398c1..ade566d20c69 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -2122,7 +2122,7 @@ static int kiblnd_net_init_pools(struct kib_net *net, struct lnet_ni *ni, int rc; int i; - tunables = &ni->ni_lnd_tunables->lt_tun_u.lt_o2ib; + tunables = &ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib; if (tunables->lnd_fmr_pool_size < *kiblnd_tunables.kib_ntx / 4) { CERROR("Can't set fmr pool size (%d) < ntx / 4(%d)\n", diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h index 42dc15cef194..522eb150d9a6 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h @@ -608,7 +608,7 @@ kiblnd_cfg_rdma_frags(struct lnet_ni *ni) struct lnet_ioctl_config_o2iblnd_tunables *tunables; int mod; - tunables = &ni->ni_lnd_tunables->lt_tun_u.lt_o2ib; + tunables = &ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib; mod = tunables->lnd_map_on_demand; return mod ? 
mod : IBLND_MAX_RDMA_FRAGS >> IBLND_FRAG_SHIFT; } @@ -627,7 +627,7 @@ kiblnd_concurrent_sends(int version, struct lnet_ni *ni) struct lnet_ioctl_config_o2iblnd_tunables *tunables; int concurrent_sends; - tunables = &ni->ni_lnd_tunables->lt_tun_u.lt_o2ib; + tunables = &ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib; concurrent_sends = tunables->lnd_concurrent_sends; if (version == IBLND_MSG_VERSION_1) { @@ -777,7 +777,7 @@ kiblnd_need_noop(struct kib_conn *conn) struct lnet_ni *ni = conn->ibc_peer->ibp_ni; LASSERT(conn->ibc_state >= IBLND_CONN_ESTABLISHED); - tunables = &ni->ni_lnd_tunables->lt_tun_u.lt_o2ib; + tunables = &ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib; if (conn->ibc_outstanding_credits < IBLND_CREDITS_HIGHWATER(tunables, conn->ibc_version) && diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c index a8d2b4911dab..c266940cb2ae 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -1452,7 +1452,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid) /* Brand new peer */ LASSERT(!peer->ibp_connecting); - tunables = &peer->ibp_ni->ni_lnd_tunables->lt_tun_u.lt_o2ib; + tunables = &peer->ibp_ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib; peer->ibp_connecting = tunables->lnd_conns_per_peer; /* always called with a ref on ni, which prevents ni being shutdown */ @@ -2592,14 +2592,14 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version, break; case IBLND_REJECT_RDMA_FRAGS: { - struct lnet_ioctl_config_lnd_tunables *tunables; + struct lnet_ioctl_config_o2iblnd_tunables *tunables; if (!cp) { reason = "can't negotiate max frags"; goto out; } - tunables = peer->ibp_ni->ni_lnd_tunables; - if (!tunables->lt_tun_u.lt_o2ib.lnd_map_on_demand) { + tunables = &peer->ibp_ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib; + if (!tunables->lnd_map_on_demand) { reason = "map_on_demand must be enabled"; goto out; } diff --git 
a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c index a1aca4dda38f..5117594f38fb 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_modparams.c @@ -185,16 +185,11 @@ int kiblnd_tunables_setup(struct lnet_ni *ni) * if there was no tunables specified, setup the tunables to be * defaulted */ - if (!ni->ni_lnd_tunables) { - ni->ni_lnd_tunables = kzalloc(sizeof(*ni->ni_lnd_tunables), - GFP_NOFS); - if (!ni->ni_lnd_tunables) - return -ENOMEM; - - memcpy(&ni->ni_lnd_tunables->lt_tun_u.lt_o2ib, + if (!ni->ni_lnd_tunables_set) + memcpy(&ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib, &default_tunables, sizeof(*tunables)); - } - tunables = &ni->ni_lnd_tunables->lt_tun_u.lt_o2ib; + + tunables = &ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib; /* Current API version */ tunables->lnd_version = 0; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 0896e75bc3d7..c944fbb155c8 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1198,6 +1198,7 @@ static int lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) { struct lnet_ioctl_config_lnd_tunables *lnd_tunables = NULL; + struct lnet_lnd_tunables *tun = NULL; int rc = -EINVAL; int lnd_type; struct lnet_lnd *lnd; @@ -1250,19 +1251,15 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) ni->ni_net->net_lnd = lnd; - if (conf && conf->cfg_hdr.ioc_len > sizeof(*conf)) + if (conf && conf->cfg_hdr.ioc_len > sizeof(*conf)) { lnd_tunables = (struct lnet_ioctl_config_lnd_tunables *)conf->cfg_bulk; + tun = &lnd_tunables->lt_tun; + } - if (lnd_tunables) { - ni->ni_lnd_tunables = kzalloc(sizeof(*ni->ni_lnd_tunables), - GFP_NOFS); - if (!ni->ni_lnd_tunables) { - mutex_unlock(&the_lnet.ln_lnd_mutex); - rc = -ENOMEM; - goto failed0; - } - 
memcpy(ni->ni_lnd_tunables, lnd_tunables, - sizeof(*ni->ni_lnd_tunables)); + if (tun) { + memcpy(&ni->ni_lnd_tunables, tun, + sizeof(*tun)); + ni->ni_lnd_tunables_set = true; } /* @@ -1702,15 +1699,15 @@ lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_data *config) tunable_size = config->cfg_hdr.ioc_len - min_size; /* Don't copy to much data to user space */ - min_size = min(tunable_size, sizeof(*ni->ni_lnd_tunables)); + min_size = min(tunable_size, sizeof(ni->ni_lnd_tunables)); lnd_cfg = (struct lnet_ioctl_config_lnd_tunables *)net_config->cfg_bulk; - if (ni->ni_lnd_tunables && lnd_cfg && min_size) { - memcpy(lnd_cfg, ni->ni_lnd_tunables, min_size); + if (lnd_cfg && min_size) { + memcpy(&lnd_cfg->lt_tun, &ni->ni_lnd_tunables, min_size); config->cfg_config_u.cfg_net.net_interface_count = 1; /* Tell user land that kernel side has less data */ - if (tunable_size > sizeof(*ni->ni_lnd_tunables)) { + if (tunable_size > sizeof(ni->ni_lnd_tunables)) { min_size = tunable_size - sizeof(ni->ni_lnd_tunables); config->cfg_hdr.ioc_len -= min_size; } diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 86a53854e427..5646feeb433e 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -105,8 +105,6 @@ lnet_ni_free(struct lnet_ni *ni) if (ni->ni_cpts) cfs_expr_list_values_free(ni->ni_cpts, ni->ni_ncpts); - kfree(ni->ni_lnd_tunables); - for (i = 0; i < LNET_MAX_INTERFACES && ni->ni_interfaces[i]; i++) kfree(ni->ni_interfaces[i]); From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591339 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 48D80112B for ; Fri, 7 Sep 2018 00:52:20 +0000 (UTC) Received: from 
mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 38DF52B17D for ; Fri, 7 Sep 2018 00:52:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2C9802B181; Fri, 7 Sep 2018 00:52:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.3 required=2.0 tests=BAYES_00,FUZZY_AMBIEN, MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=no version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C8B7C2B17D for ; Fri, 7 Sep 2018 00:52:18 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 579824E2FBA; Thu, 6 Sep 2018 17:52:18 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 029694E2FBA for ; Thu, 6 Sep 2018 17:52:16 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 1BF90AED7; Fri, 7 Sep 2018 00:52:15 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137142.8267.15402125903541546660.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 05/34] lnet: begin separating "networks" from "network interfaces". 
X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP We already have "struct lnet_net" separate from "struct lnet_ni", but they are currently allocated together and freed together, and it is assumed that they are 1-to-1. This patch starts breaking that assumption. We have separate lnet_net_alloc() and lnet_net_free() to alloc/free the new lnet_net, though they are currently called only when lnet_ni_alloc()/lnet_ni_free() are called. The netid is now stored in the lnet_net and fetched directly from there, rather than extracting it from the net-interface-id ni_nid. The linkage between these two structures is now richer: an lnet_net can link to a list of lnet_ni. struct lnet now has a list of lnet_net (ln_nets) rather than a list of lnet_ni, so to find all the lnet_ni, we need to walk a list of lists. This need to walk a list-of-lists occurs in several places, and new helpers like lnet_get_ni_idx_locked() and lnet_get_next_ni_locked() are introduced. Previously a list_head was passed to lnet_ni_alloc() for the new lnet_ni to be attached to. Now a list is passed to lnet_net_alloc() for the net to be attached to, and an lnet_net is passed to lnet_ni_alloc() for the ni to attach to. lnet_ni_alloc() also receives an interface name, but this is currently unused. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Doug Oucharek
Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 15 + .../staging/lustre/include/linux/lnet/lib-types.h | 23 +- drivers/staging/lustre/lnet/lnet/acceptor.c | 2 drivers/staging/lustre/lnet/lnet/api-ni.c | 255 ++++++++++++++------ drivers/staging/lustre/lnet/lnet/config.c | 135 +++++++---- drivers/staging/lustre/lnet/lnet/lib-move.c | 6 drivers/staging/lustre/lnet/lnet/router.c | 15 - drivers/staging/lustre/lnet/lnet/router_proc.c | 16 - 8 files changed, 308 insertions(+), 159 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 0fecf0d32c58..4440b87299c4 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -369,8 +369,14 @@ lnet_ni_decref(struct lnet_ni *ni) } void lnet_ni_free(struct lnet_ni *ni); +void lnet_net_free(struct lnet_net *net); + +struct lnet_net * +lnet_net_alloc(__u32 net_type, struct list_head *netlist); + struct lnet_ni * -lnet_ni_alloc(__u32 net, struct cfs_expr_list *el, struct list_head *nilist); +lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, + char *iface); static inline int lnet_nid2peerhash(lnet_nid_t nid) @@ -412,6 +418,9 @@ void lnet_destroy_routes(void); int lnet_get_route(int idx, __u32 *net, __u32 *hops, lnet_nid_t *gateway, __u32 *alive, __u32 *priority); int lnet_get_rtr_pool_cfg(int idx, struct lnet_ioctl_pool_cfg *pool_cfg); +struct lnet_ni *lnet_get_next_ni_locked(struct lnet_net *mynet, + struct lnet_ni *prev); +struct lnet_ni *lnet_get_ni_idx_locked(int idx); void lnet_router_debugfs_init(void); void lnet_router_debugfs_fini(void); @@ -584,7 +593,7 @@ int lnet_connect(struct socket **sockp, lnet_nid_t peer_nid, __u32 local_ip, __u32 peer_ip, int peer_port); void lnet_connect_console_error(int rc, lnet_nid_t peer_nid, __u32 peer_ip, int port); -int lnet_count_acceptor_nis(void); +int lnet_count_acceptor_nets(void); int 
lnet_acceptor_timeout(void); int lnet_acceptor_port(void); @@ -618,7 +627,7 @@ void lnet_swap_pinginfo(struct lnet_ping_info *info); int lnet_parse_ip2nets(char **networksp, char *ip2nets); int lnet_parse_routes(char *route_str, int *im_a_router); int lnet_parse_networks(struct list_head *nilist, char *networks); -int lnet_net_unique(__u32 net, struct list_head *nilist); +bool lnet_net_unique(__u32 net, struct list_head *nilist); int lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt); struct lnet_peer *lnet_find_peer_locked(struct lnet_peer_table *ptable, diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index c5e3363de727..5f0d4703bf86 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -254,6 +254,15 @@ struct lnet_tx_queue { }; struct lnet_net { + /* chain on the ln_nets */ + struct list_head net_list; + + /* net ID, which is compoed of + * (net_type << 16) | net_num. 
+ * net_type can be one of the enumarated types defined in + * lnet/include/lnet/nidstr.h */ + __u32 net_id; + /* network tunables */ struct lnet_ioctl_config_lnd_cmn_tunables net_tunables; @@ -264,11 +273,13 @@ struct lnet_net { bool net_tunables_set; /* procedural interface */ struct lnet_lnd *net_lnd; + /* list of NIs on this net */ + struct list_head net_ni_list; }; struct lnet_ni { - /* chain on ln_nis */ - struct list_head ni_list; + /* chain on the lnet_net structure */ + struct list_head ni_netlist; /* chain on ln_nis_cpt */ struct list_head ni_cptlist; @@ -626,14 +637,16 @@ struct lnet { /* failure simulation */ struct list_head ln_test_peers; struct list_head ln_drop_rules; - struct list_head ln_delay_rules; + struct list_head ln_delay_rules; - struct list_head ln_nis; /* LND instances */ + /* LND instances */ + struct list_head ln_nets; /* NIs bond on specific CPT(s) */ struct list_head ln_nis_cpt; /* dying LND instances */ struct list_head ln_nis_zombie; - struct lnet_ni *ln_loni; /* the loopback NI */ + /* the loopback NI */ + struct lnet_ni *ln_loni; /* remote networks with routes to them */ struct list_head *ln_remote_nets_hash; diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c index f8c921f0221c..88b90c1fdbaf 100644 --- a/drivers/staging/lustre/lnet/lnet/acceptor.c +++ b/drivers/staging/lustre/lnet/lnet/acceptor.c @@ -454,7 +454,7 @@ lnet_acceptor_start(void) if (rc <= 0) return rc; - if (!lnet_count_acceptor_nis()) /* not required */ + if (lnet_count_acceptor_nets() == 0) /* not required */ return 0; task = kthread_run(lnet_acceptor, (void *)(uintptr_t)secure, diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index c944fbb155c8..05687278334a 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -537,7 +537,7 @@ lnet_prepare(lnet_pid_t requested_pid) the_lnet.ln_pid = requested_pid; 
INIT_LIST_HEAD(&the_lnet.ln_test_peers); - INIT_LIST_HEAD(&the_lnet.ln_nis); + INIT_LIST_HEAD(&the_lnet.ln_nets); INIT_LIST_HEAD(&the_lnet.ln_nis_cpt); INIT_LIST_HEAD(&the_lnet.ln_nis_zombie); INIT_LIST_HEAD(&the_lnet.ln_routers); @@ -616,7 +616,7 @@ lnet_unprepare(void) LASSERT(!the_lnet.ln_refcount); LASSERT(list_empty(&the_lnet.ln_test_peers)); - LASSERT(list_empty(&the_lnet.ln_nis)); + LASSERT(list_empty(&the_lnet.ln_nets)); LASSERT(list_empty(&the_lnet.ln_nis_cpt)); LASSERT(list_empty(&the_lnet.ln_nis_zombie)); @@ -648,14 +648,17 @@ lnet_unprepare(void) } struct lnet_ni * -lnet_net2ni_locked(__u32 net, int cpt) +lnet_net2ni_locked(__u32 net_id, int cpt) { - struct lnet_ni *ni; + struct lnet_ni *ni; + struct lnet_net *net; LASSERT(cpt != LNET_LOCK_EX); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - if (LNET_NIDNET(ni->ni_nid) == net) { + list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + if (net->net_id == net_id) { + ni = list_entry(net->net_ni_list.next, struct lnet_ni, + ni_netlist); lnet_ni_addref_locked(ni, cpt); return ni; } @@ -760,14 +763,17 @@ lnet_islocalnet(__u32 net) struct lnet_ni * lnet_nid2ni_locked(lnet_nid_t nid, int cpt) { - struct lnet_ni *ni; + struct lnet_net *net; + struct lnet_ni *ni; LASSERT(cpt != LNET_LOCK_EX); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - if (ni->ni_nid == nid) { - lnet_ni_addref_locked(ni, cpt); - return ni; + list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) { + if (ni->ni_nid == nid) { + lnet_ni_addref_locked(ni, cpt); + return ni; + } } } @@ -790,16 +796,18 @@ lnet_islocalnid(lnet_nid_t nid) } int -lnet_count_acceptor_nis(void) +lnet_count_acceptor_nets(void) { /* Return the # of NIs that need the acceptor. 
*/ - int count = 0; - struct lnet_ni *ni; - int cpt; + int count = 0; + struct lnet_net *net; + int cpt; cpt = lnet_net_lock_current(); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - if (ni->ni_net->net_lnd->lnd_accept) + list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + /* all socklnd type networks should have the acceptor + * thread started */ + if (net->net_lnd->lnd_accept) count++; } @@ -832,13 +840,16 @@ lnet_ping_info_create(int num_ni) static inline int lnet_get_ni_count(void) { - struct lnet_ni *ni; - int count = 0; + struct lnet_ni *ni; + struct lnet_net *net; + int count = 0; lnet_net_lock(0); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) - count++; + list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) + count++; + } lnet_net_unlock(0); @@ -854,14 +865,17 @@ lnet_ping_info_free(struct lnet_ping_info *pinfo) static void lnet_ping_info_destroy(void) { + struct lnet_net *net; struct lnet_ni *ni; lnet_net_lock(LNET_LOCK_EX); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - lnet_ni_lock(ni); - ni->ni_status = NULL; - lnet_ni_unlock(ni); + list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) { + lnet_ni_lock(ni); + ni->ni_status = NULL; + lnet_ni_unlock(ni); + } } lnet_ping_info_free(the_lnet.ln_ping_info); @@ -963,24 +977,28 @@ lnet_ping_md_unlink(struct lnet_ping_info *pinfo, static void lnet_ping_info_install_locked(struct lnet_ping_info *ping_info) { + int i = 0; struct lnet_ni_status *ns; struct lnet_ni *ni; - int i = 0; + struct lnet_net *net; - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - LASSERT(i < ping_info->pi_nnis); + list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) { + LASSERT(i < ping_info->pi_nnis); - ns = &ping_info->pi_ni[i]; + ns = &ping_info->pi_ni[i]; - ns->ns_nid = ni->ni_nid; + ns->ns_nid = ni->ni_nid; - 
lnet_ni_lock(ni); - ns->ns_status = (ni->ni_status) ? - ni->ni_status->ns_status : LNET_NI_STATUS_UP; - ni->ni_status = ns; - lnet_ni_unlock(ni); + lnet_ni_lock(ni); + ns->ns_status = ni->ni_status ? + ni->ni_status->ns_status : + LNET_NI_STATUS_UP; + ni->ni_status = ns; + lnet_ni_unlock(ni); - i++; + i++; + } } } @@ -1054,9 +1072,9 @@ lnet_ni_unlink_locked(struct lnet_ni *ni) } /* move it to zombie list and nobody can find it anymore */ - LASSERT(!list_empty(&ni->ni_list)); - list_move(&ni->ni_list, &the_lnet.ln_nis_zombie); - lnet_ni_decref_locked(ni, 0); /* drop ln_nis' ref */ + LASSERT(!list_empty(&ni->ni_netlist)); + list_move(&ni->ni_netlist, &the_lnet.ln_nis_zombie); + lnet_ni_decref_locked(ni, 0); } static void @@ -1076,17 +1094,17 @@ lnet_clear_zombies_nis_locked(void) int j; ni = list_entry(the_lnet.ln_nis_zombie.next, - struct lnet_ni, ni_list); - list_del_init(&ni->ni_list); + struct lnet_ni, ni_netlist); + list_del_init(&ni->ni_netlist); cfs_percpt_for_each(ref, j, ni->ni_refs) { if (!*ref) continue; /* still busy, add it back to zombie list */ - list_add(&ni->ni_list, &the_lnet.ln_nis_zombie); + list_add(&ni->ni_netlist, &the_lnet.ln_nis_zombie); break; } - if (!list_empty(&ni->ni_list)) { + if (!list_empty(&ni->ni_netlist)) { lnet_net_unlock(LNET_LOCK_EX); ++i; if ((i & (-i)) == i) { @@ -1126,6 +1144,7 @@ lnet_shutdown_lndnis(void) { struct lnet_ni *ni; int i; + struct lnet_net *net; /* NB called holding the global mutex */ @@ -1138,10 +1157,14 @@ lnet_shutdown_lndnis(void) the_lnet.ln_shutdown = 1; /* flag shutdown */ /* Unlink NIs from the global table */ - while (!list_empty(&the_lnet.ln_nis)) { - ni = list_entry(the_lnet.ln_nis.next, - struct lnet_ni, ni_list); - lnet_ni_unlink_locked(ni); + while (!list_empty(&the_lnet.ln_nets)) { + net = list_entry(the_lnet.ln_nets.next, + struct lnet_net, net_list); + while (!list_empty(&net->net_ni_list)) { + ni = list_entry(net->net_ni_list.next, + struct lnet_ni, ni_netlist); + lnet_ni_unlink_locked(ni); + 
} } /* Drop the cached loopback NI. */ @@ -1212,7 +1235,7 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) /* Make sure this new NI is unique. */ lnet_net_lock(LNET_LOCK_EX); - rc = lnet_net_unique(LNET_NIDNET(ni->ni_nid), &the_lnet.ln_nis); + rc = lnet_net_unique(LNET_NIDNET(ni->ni_nid), &the_lnet.ln_nets); lnet_net_unlock(LNET_LOCK_EX); if (!rc) { if (lnd_type == LOLND) { @@ -1297,7 +1320,7 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) lnet_net_lock(LNET_LOCK_EX); /* refcount for ln_nis */ lnet_ni_addref_locked(ni, 0); - list_add_tail(&ni->ni_list, &the_lnet.ln_nis); + list_add_tail(&ni->ni_net->net_list, &the_lnet.ln_nets); if (ni->ni_cpts) { lnet_ni_addref_locked(ni, 0); list_add_tail(&ni->ni_cptlist, &the_lnet.ln_nis_cpt); @@ -1363,8 +1386,8 @@ lnet_startup_lndnis(struct list_head *nilist) int ni_count = 0; while (!list_empty(nilist)) { - ni = list_entry(nilist->next, struct lnet_ni, ni_list); - list_del(&ni->ni_list); + ni = list_entry(nilist->next, struct lnet_ni, ni_netlist); + list_del(&ni->ni_netlist); rc = lnet_startup_lndni(ni, NULL); if (rc < 0) @@ -1486,6 +1509,7 @@ LNetNIInit(lnet_pid_t requested_pid) struct lnet_ping_info *pinfo; struct lnet_handle_md md_handle; struct list_head net_head; + struct lnet_net *net; INIT_LIST_HEAD(&net_head); @@ -1505,8 +1529,15 @@ LNetNIInit(lnet_pid_t requested_pid) return rc; } - /* Add in the loopback network */ - if (!lnet_ni_alloc(LNET_MKNET(LOLND, 0), NULL, &net_head)) { + /* create a network for Loopback network */ + net = lnet_net_alloc(LNET_MKNET(LOLND, 0), &net_head); + if (net == NULL) { + rc = -ENOMEM; + goto err_empty_list; + } + + /* Add in the loopback NI */ + if (lnet_ni_alloc(net, NULL, NULL) == NULL) { rc = -ENOMEM; goto err_empty_list; } @@ -1584,11 +1615,11 @@ LNetNIInit(lnet_pid_t requested_pid) LASSERT(rc < 0); mutex_unlock(&the_lnet.ln_api_mutex); while (!list_empty(&net_head)) { - struct lnet_ni *ni; + struct lnet_net *net; - ni = 
list_entry(net_head.next, struct lnet_ni, ni_list); - list_del_init(&ni->ni_list); - lnet_ni_free(ni); + net = list_entry(net_head.next, struct lnet_net, net_list); + list_del_init(&net->net_list); + lnet_net_free(net); } return rc; } @@ -1714,25 +1745,83 @@ lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_data *config) } } +struct lnet_ni * +lnet_get_ni_idx_locked(int idx) +{ + struct lnet_ni *ni; + struct lnet_net *net; + + list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) { + if (idx-- == 0) + return ni; + } + } + + return NULL; +} + +struct lnet_ni * +lnet_get_next_ni_locked(struct lnet_net *mynet, struct lnet_ni *prev) +{ + struct lnet_ni *ni; + struct lnet_net *net = mynet; + + if (prev == NULL) { + if (net == NULL) + net = list_entry(the_lnet.ln_nets.next, struct lnet_net, + net_list); + ni = list_entry(net->net_ni_list.next, struct lnet_ni, + ni_netlist); + + return ni; + } + + if (prev->ni_netlist.next == &prev->ni_net->net_ni_list) { + /* if you reached the end of the ni list and the net is + * specified, then there are no more nis in that net */ + if (net != NULL) + return NULL; + + /* we reached the end of this net ni list. move to the + * next net */ + if (prev->ni_net->net_list.next == &the_lnet.ln_nets) + /* no more nets and no more NIs. 
*/ + return NULL; + + /* get the next net */ + net = list_entry(prev->ni_net->net_list.next, struct lnet_net, + net_list); + /* get the ni on it */ + ni = list_entry(net->net_ni_list.next, struct lnet_ni, + ni_netlist); + + return ni; + } + + /* there are more nis left */ + ni = list_entry(prev->ni_netlist.next, struct lnet_ni, ni_netlist); + + return ni; +} + static int lnet_get_net_config(struct lnet_ioctl_config_data *config) { struct lnet_ni *ni; + int cpt; int idx = config->cfg_count; - int cpt, i = 0; int rc = -ENOENT; cpt = lnet_net_lock_current(); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - if (i++ != idx) - continue; + ni = lnet_get_ni_idx_locked(idx); + if (ni != NULL) { + rc = 0; lnet_ni_lock(ni); lnet_fill_ni_info(ni, config); lnet_ni_unlock(ni); - rc = 0; - break; } lnet_net_unlock(cpt); @@ -1745,6 +1834,7 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) char *nets = conf->cfg_config_u.cfg_net.net_intf; struct lnet_ping_info *pinfo; struct lnet_handle_md md_handle; + struct lnet_net *net; struct lnet_ni *ni; struct list_head net_head; struct lnet_remotenet *rnet; @@ -1752,7 +1842,7 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) INIT_LIST_HEAD(&net_head); - /* Create a ni structure for the network string */ + /* Create a net/ni structures for the network string */ rc = lnet_parse_networks(&net_head, nets); if (rc <= 0) return !rc ? 
-EINVAL : rc; @@ -1760,14 +1850,14 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) mutex_lock(&the_lnet.ln_api_mutex); if (rc > 1) { - rc = -EINVAL; /* only add one interface per call */ + rc = -EINVAL; /* only add one network per call */ goto failed0; } - ni = list_entry(net_head.next, struct lnet_ni, ni_list); + net = list_entry(net_head.next, struct lnet_net, net_list); lnet_net_lock(LNET_LOCK_EX); - rnet = lnet_find_net_locked(LNET_NIDNET(ni->ni_nid)); + rnet = lnet_find_net_locked(net->net_id); lnet_net_unlock(LNET_LOCK_EX); /* * make sure that the net added doesn't invalidate the current @@ -1785,8 +1875,8 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) if (rc) goto failed0; - list_del_init(&ni->ni_list); - + list_del_init(&net->net_list); + ni = list_first_entry(&net->net_ni_list, struct lnet_ni, ni_netlist); rc = lnet_startup_lndni(ni, conf); if (rc) goto failed1; @@ -1812,9 +1902,9 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) failed0: mutex_unlock(&the_lnet.ln_api_mutex); while (!list_empty(&net_head)) { - ni = list_entry(net_head.next, struct lnet_ni, ni_list); - list_del_init(&ni->ni_list); - lnet_ni_free(ni); + net = list_entry(net_head.next, struct lnet_net, net_list); + list_del_init(&net->net_list); + lnet_net_free(net); } return rc; } @@ -1849,7 +1939,7 @@ lnet_dyn_del_ni(__u32 net) lnet_shutdown_lndni(ni); - if (!lnet_count_acceptor_nis()) + if (!lnet_count_acceptor_nets()) lnet_acceptor_stop(); lnet_ping_target_update(pinfo, md_handle); @@ -2103,7 +2193,8 @@ EXPORT_SYMBOL(LNetDebugPeer); int LNetGetId(unsigned int index, struct lnet_process_id *id) { - struct lnet_ni *ni; + struct lnet_ni *ni; + struct lnet_net *net; int cpt; int rc = -ENOENT; @@ -2111,14 +2202,16 @@ LNetGetId(unsigned int index, struct lnet_process_id *id) cpt = lnet_net_lock_current(); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { - if (index--) - continue; + 
list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) { + if (index-- != 0) + continue; - id->nid = ni->ni_nid; - id->pid = the_lnet.ln_pid; - rc = 0; - break; + id->nid = ni->ni_nid; + id->pid = the_lnet.ln_pid; + rc = 0; + break; + } } lnet_net_unlock(cpt); diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 5646feeb433e..e83bdbec11e3 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -78,17 +78,17 @@ lnet_issep(char c) } } -int -lnet_net_unique(__u32 net, struct list_head *nilist) +bool +lnet_net_unique(__u32 net, struct list_head *netlist) { - struct lnet_ni *ni; + struct lnet_net *net_l; - list_for_each_entry(ni, nilist, ni_list) { - if (LNET_NIDNET(ni->ni_nid) == net) - return 0; + list_for_each_entry(net_l, netlist, net_list) { + if (net_l->net_id == net) + return false; } - return 1; + return true; } void @@ -112,41 +112,78 @@ lnet_ni_free(struct lnet_ni *ni) if (ni->ni_net_ns) put_net(ni->ni_net_ns); - kvfree(ni->ni_net); kfree(ni); } -struct lnet_ni * -lnet_ni_alloc(__u32 net_id, struct cfs_expr_list *el, struct list_head *nilist) +void +lnet_net_free(struct lnet_net *net) { - struct lnet_tx_queue *tq; + struct list_head *tmp, *tmp2; struct lnet_ni *ni; - int rc; - int i; + + /* delete any nis which have been started. */ + list_for_each_safe(tmp, tmp2, &net->net_ni_list) { + ni = list_entry(tmp, struct lnet_ni, ni_netlist); + list_del_init(&ni->ni_netlist); + lnet_ni_free(ni); + } + + kfree(net); +} + +struct lnet_net * +lnet_net_alloc(__u32 net_id, struct list_head *net_list) +{ struct lnet_net *net; - if (!lnet_net_unique(net_id, nilist)) { - LCONSOLE_ERROR_MSG(0x111, "Duplicate network specified: %s\n", - libcfs_net2str(net_id)); + if (!lnet_net_unique(net_id, net_list)) { + CERROR("Duplicate net %s. 
Ignore\n", + libcfs_net2str(net_id)); return NULL; } - ni = kzalloc(sizeof(*ni), GFP_NOFS); net = kzalloc(sizeof(*net), GFP_NOFS); - if (!ni || !net) { - kfree(ni); kfree(net); + if (!net) { CERROR("Out of memory creating network %s\n", libcfs_net2str(net_id)); return NULL; } + + INIT_LIST_HEAD(&net->net_list); + INIT_LIST_HEAD(&net->net_ni_list); + + net->net_id = net_id; + /* initialize global paramters to undefiend */ net->net_tunables.lct_peer_timeout = -1; net->net_tunables.lct_max_tx_credits = -1; net->net_tunables.lct_peer_tx_credits = -1; net->net_tunables.lct_peer_rtr_credits = -1; + list_add_tail(&net->net_list, net_list); + + return net; +} + +struct lnet_ni * +lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) +{ + struct lnet_tx_queue *tq; + struct lnet_ni *ni; + int rc; + int i; + + ni = kzalloc(sizeof(*ni), GFP_KERNEL); + if (ni == NULL) { + CERROR("Out of memory creating network interface %s%s\n", + libcfs_net2str(net->net_id), + (iface != NULL) ? iface : ""); + return NULL; + } + spin_lock_init(&ni->ni_lock); INIT_LIST_HEAD(&ni->ni_cptlist); + INIT_LIST_HEAD(&ni->ni_netlist); ni->ni_refs = cfs_percpt_alloc(lnet_cpt_table(), sizeof(*ni->ni_refs[0])); if (!ni->ni_refs) @@ -166,8 +203,9 @@ lnet_ni_alloc(__u32 net_id, struct cfs_expr_list *el, struct list_head *nilist) } else { rc = cfs_expr_list_values(el, LNET_CPT_NUMBER, &ni->ni_cpts); if (rc <= 0) { - CERROR("Failed to set CPTs for NI %s: %d\n", - libcfs_net2str(net_id), rc); + CERROR("Failed to set CPTs for NI %s(%s): %d\n", + libcfs_net2str(net->net_id), + (iface != NULL) ? 
iface : "", rc); goto failed; } @@ -182,7 +220,7 @@ lnet_ni_alloc(__u32 net_id, struct cfs_expr_list *el, struct list_head *nilist) ni->ni_net = net; /* LND will fill in the address part of the NID */ - ni->ni_nid = LNET_MKNID(net_id, 0); + ni->ni_nid = LNET_MKNID(net->net_id, 0); /* Store net namespace in which current ni is being created */ if (current->nsproxy->net_ns) @@ -191,22 +229,24 @@ lnet_ni_alloc(__u32 net_id, struct cfs_expr_list *el, struct list_head *nilist) ni->ni_net_ns = NULL; ni->ni_last_alive = ktime_get_real_seconds(); - list_add_tail(&ni->ni_list, nilist); + list_add_tail(&ni->ni_netlist, &net->net_ni_list); + return ni; - failed: +failed: lnet_ni_free(ni); return NULL; } int -lnet_parse_networks(struct list_head *nilist, char *networks) +lnet_parse_networks(struct list_head *netlist, char *networks) { struct cfs_expr_list *el = NULL; char *tokens; char *str; char *tmp; - struct lnet_ni *ni; - __u32 net; + struct lnet_net *net; + struct lnet_ni *ni = NULL; + __u32 net_id; int nnets = 0; struct list_head *temp_node; @@ -275,18 +315,21 @@ lnet_parse_networks(struct list_head *nilist, char *networks) if (comma) *comma++ = 0; - net = libcfs_str2net(strim(str)); + net_id = libcfs_str2net(strim(str)); - if (net == LNET_NIDNET(LNET_NID_ANY)) { + if (net_id == LNET_NIDNET(LNET_NID_ANY)) { LCONSOLE_ERROR_MSG(0x113, "Unrecognised network type\n"); tmp = str; goto failed_syntax; } - if (LNET_NETTYP(net) != LOLND && /* LO is implicit */ - !lnet_ni_alloc(net, el, nilist)) - goto failed; + if (LNET_NETTYP(net_id) != LOLND) { /* LO is implicit */ + net = lnet_net_alloc(net_id, netlist); + if (!net || + !lnet_ni_alloc(net, el, NULL)) + goto failed; + } if (el) { cfs_expr_list_free(el); @@ -298,14 +341,21 @@ lnet_parse_networks(struct list_head *nilist, char *networks) } *bracket = 0; - net = libcfs_str2net(strim(str)); - if (net == LNET_NIDNET(LNET_NID_ANY)) { + net_id = libcfs_str2net(strim(str)); + if (net_id == LNET_NIDNET(LNET_NID_ANY)) { tmp = str; goto 
failed_syntax; } - ni = lnet_ni_alloc(net, el, nilist); - if (!ni) + /* always allocate a net, since we will eventually add an + * interface to it, or we will fail, in which case we'll + * just delete it */ + net = lnet_net_alloc(net_id, netlist); + if (IS_ERR_OR_NULL(net)) + goto failed; + + ni = lnet_ni_alloc(net, el, NULL); + if (IS_ERR_OR_NULL(ni)) goto failed; if (el) { @@ -337,7 +387,7 @@ lnet_parse_networks(struct list_head *nilist, char *networks) if (niface == LNET_MAX_INTERFACES) { LCONSOLE_ERROR_MSG(0x115, "Too many interfaces for net %s\n", - libcfs_net2str(net)); + libcfs_net2str(net_id)); goto failed; } @@ -378,7 +428,7 @@ lnet_parse_networks(struct list_head *nilist, char *networks) } } - list_for_each(temp_node, nilist) + list_for_each(temp_node, netlist) nnets++; kfree(tokens); @@ -387,11 +437,12 @@ lnet_parse_networks(struct list_head *nilist, char *networks) failed_syntax: lnet_syntax("networks", networks, (int)(tmp - tokens), strlen(tmp)); failed: - while (!list_empty(nilist)) { - ni = list_entry(nilist->next, struct lnet_ni, ni_list); + /* free the net list and all the nis on each net */ + while (!list_empty(netlist)) { + net = list_entry(netlist->next, struct lnet_net, net_list); - list_del(&ni->ni_list); - lnet_ni_free(ni); + list_del_init(&net->net_list); + lnet_net_free(net); } if (el) diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 1bf12af87a20..1c874025fa74 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -2289,7 +2289,7 @@ EXPORT_SYMBOL(LNetGet); int LNetDist(lnet_nid_t dstnid, lnet_nid_t *srcnidp, __u32 *orderp) { - struct lnet_ni *ni; + struct lnet_ni *ni = NULL; struct lnet_remotenet *rnet; __u32 dstnet = LNET_NIDNET(dstnid); int hops; @@ -2307,9 +2307,9 @@ LNetDist(lnet_nid_t dstnid, lnet_nid_t *srcnidp, __u32 *orderp) cpt = lnet_net_lock_current(); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { + while 
((ni = lnet_get_next_ni_locked(NULL, ni))) { if (ni->ni_nid == dstnid) { - if (srcnidp) + if (srcnidp != NULL) *srcnidp = dstnid; if (orderp) { if (LNET_NETTYP(LNET_NIDNET(dstnid)) == LOLND) diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 0c0ec0b27982..135dfe793b0b 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -245,13 +245,10 @@ static void lnet_shuffle_seed(void) if (seeded) return; - /* - * Nodes with small feet have little entropy - * the NID for this node gives the most entropy in the low bits - */ - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { + /* Nodes with small feet have little entropy + * the NID for this node gives the most entropy in the low bits */ + while ((ni = lnet_get_next_ni_locked(NULL, ni))) { __u32 lnd_type, seed; - lnd_type = LNET_NETTYP(LNET_NIDNET(ni->ni_nid)); if (lnd_type != LOLND) { seed = (LNET_NIDADDR(ni->ni_nid) | lnd_type); @@ -807,8 +804,8 @@ lnet_router_ni_update_locked(struct lnet_peer *gw, __u32 net) static void lnet_update_ni_status_locked(void) { - struct lnet_ni *ni; - time64_t now; + struct lnet_ni *ni = NULL; + time64_t now; time64_t timeout; LASSERT(the_lnet.ln_routing); @@ -817,7 +814,7 @@ lnet_update_ni_status_locked(void) max(live_router_check_interval, dead_router_check_interval); now = ktime_get_real_seconds(); - list_for_each_entry(ni, &the_lnet.ln_nis, ni_list) { + while ((ni = lnet_get_next_ni_locked(NULL, ni))) { if (ni->ni_net->net_lnd->lnd_type == LOLND) continue; diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c index f3ccd6a2b70e..2a366e9a8627 100644 --- a/drivers/staging/lustre/lnet/lnet/router_proc.c +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c @@ -641,26 +641,12 @@ static int proc_lnet_nis(struct ctl_table *table, int write, "rtr", "max", "tx", "min"); LASSERT(tmpstr + tmpsiz - s > 0); } else { - struct list_head *n; 
struct lnet_ni *ni = NULL; int skip = *ppos - 1; lnet_net_lock(0); - n = the_lnet.ln_nis.next; - - while (n != &the_lnet.ln_nis) { - struct lnet_ni *a_ni; - - a_ni = list_entry(n, struct lnet_ni, ni_list); - if (!skip) { - ni = a_ni; - break; - } - - skip--; - n = n->next; - } + ni = lnet_get_ni_idx_locked(skip); if (ni) { struct lnet_tx_queue *tq; From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591341 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AF830921 for ; Fri, 7 Sep 2018 00:52:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A2E1D2B17E for ; Fri, 7 Sep 2018 00:52:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 96D852B181; Fri, 7 Sep 2018 00:52:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5F78E2B17D for ; Fri, 7 Sep 2018 00:52:25 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 06A504E30C9; Thu, 6 Sep 2018 17:52:25 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 
83A8F4E2FED for ; Thu, 6 Sep 2018 17:52:23 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id AA41FAEF1; Fri, 7 Sep 2018 00:52:22 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137147.8267.3706504130592682241.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 06/34] lnet: store separate xmit/recv net-interface in each message. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Currently we store the net-interface in the peer, but the peer should identify just the network, not the particular interface. To help track which actual interface is used for each message, store them explicitly. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split and includes commit 63c3e5129873 ("LU-7734 lnet: Fix lnet_msg_free()") Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons Signed-off-by: Amir Shehata Reviewed-by: Olaf Weber --- .../staging/lustre/include/linux/lnet/lib-types.h | 3 +++ drivers/staging/lustre/lnet/lnet/lib-move.c | 21 ++++++++++++++++++-- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 5f0d4703bf86..16a493529a46 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -98,6 +98,9 @@ struct lnet_msg { void *msg_private; struct lnet_libmd *msg_md; + /* the NI the message was sent or received over */ + struct lnet_ni *msg_txni; + struct lnet_ni *msg_rxni; unsigned int msg_len; unsigned int msg_wanted; diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 1c874025fa74..b2a52ddcefcb 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -782,6 +782,7 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) { struct lnet_peer *txpeer = msg->msg_txpeer; struct lnet_msg *msg2; + struct lnet_ni *txni = msg->msg_txni; if (msg->msg_txcredit) { struct lnet_ni *ni = txpeer->lp_ni; @@ -829,6 +830,11 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) } } + if (txni != NULL) { + msg->msg_txni = NULL; + lnet_ni_decref_locked(txni, msg->msg_tx_cpt); + } + if (txpeer) { msg->msg_txpeer = NULL; lnet_peer_decref_locked(txpeer); @@ -876,6 +882,7 @@ void lnet_return_rx_credits_locked(struct lnet_msg *msg) { struct lnet_peer *rxpeer = msg->msg_rxpeer; + struct lnet_ni *rxni = msg->msg_rxni; struct lnet_msg *msg2; if (msg->msg_rtrcredit) { @@ -951,6 +958,10 @@ lnet_return_rx_credits_locked(struct lnet_msg *msg)
(void)lnet_post_routed_recv_locked(msg2, 1); } } + if (rxni != NULL) { + msg->msg_rxni = NULL; + lnet_ni_decref_locked(rxni, msg->msg_rx_cpt); + } if (rxpeer) { msg->msg_rxpeer = NULL; lnet_peer_decref_locked(rxpeer); @@ -1218,9 +1229,12 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) LASSERT(!msg->msg_peertxcredit); LASSERT(!msg->msg_txcredit); - LASSERT(!msg->msg_txpeer); + LASSERT(msg->msg_txpeer == NULL); - msg->msg_txpeer = lp; /* msg takes my ref on lp */ + msg->msg_txpeer = lp; /* msg takes my ref on lp */ + /* set the NI for this message */ + msg->msg_txni = src_ni; + lnet_ni_addref_locked(msg->msg_txni, cpt); rc = lnet_post_send_locked(msg, 0); lnet_net_unlock(cpt); @@ -1818,6 +1832,8 @@ lnet_parse(struct lnet_ni *ni, struct lnet_hdr *hdr, lnet_nid_t from_nid, return 0; goto drop; } + msg->msg_rxni = ni; + lnet_ni_addref_locked(ni, cpt); if (lnet_isrouter(msg->msg_rxpeer)) { lnet_peer_set_alive(msg->msg_rxpeer); @@ -1934,6 +1950,7 @@ lnet_recv_delayed_msg_list(struct list_head *head) LASSERT(msg->msg_rx_delayed); LASSERT(msg->msg_md); LASSERT(msg->msg_rxpeer); + LASSERT(msg->msg_rxni); LASSERT(msg->msg_hdr.type == LNET_MSG_PUT); CDEBUG(D_NET, "Resuming delayed PUT from %s portal %d match %llu offset %d length %d.\n", From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591343 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 10337921 for ; Fri, 7 Sep 2018 00:52:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 00BDC2B17D for ; Fri, 7 Sep 2018 00:52:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E90902B181; Fri, 7 Sep 2018 00:52:35 +0000 (UTC) 
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0F9F42B17D for ; Fri, 7 Sep 2018 00:52:35 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C16C04E3128; Thu, 6 Sep 2018 17:52:34 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2F5294E2FED for ; Thu, 6 Sep 2018 17:52:33 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 36E8DAED7; Fri, 7 Sep 2018 00:52:32 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137151.8267.3943711043829439593.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 07/34] lnet: change lnet_peer to reference the net, rather than ni. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP As a net will soon have multiple ni, a peer should identify just the net. In the various places where we need the ni, we now use rxni or txni from the message. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 3 + .../staging/lustre/include/linux/lnet/lib-types.h | 5 +- drivers/staging/lustre/lnet/lnet/api-ni.c | 13 +++++ drivers/staging/lustre/lnet/lnet/lib-move.c | 49 +++++++++++--------- drivers/staging/lustre/lnet/lnet/lib-ptl.c | 2 - drivers/staging/lustre/lnet/lnet/net_fault.c | 3 + drivers/staging/lustre/lnet/lnet/peer.c | 26 ++++------- drivers/staging/lustre/lnet/lnet/router.c | 14 +++--- drivers/staging/lustre/lnet/lnet/router_proc.c | 2 - 9 files changed, 67 insertions(+), 50 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 4440b87299c4..34509e52bac7 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -435,6 +435,7 @@ int lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf); int lnet_dyn_del_ni(__u32 net); int lnet_clear_lazy_portal(struct lnet_ni *ni, int portal, char *reason); +struct lnet_net *lnet_get_net_locked(__u32 net_id); int lnet_islocalnid(lnet_nid_t nid); int lnet_islocalnet(__u32 net); @@ -617,7 +618,7 @@ int lnet_sock_connect(struct socket **sockp, int *fatal, void libcfs_sock_release(struct socket *sock); int lnet_peers_start_down(void); -int lnet_peer_buffer_credits(struct lnet_ni *ni); +int lnet_peer_buffer_credits(struct lnet_net *net); int lnet_router_checker_start(void); void lnet_router_checker_stop(void); diff --git
a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 16a493529a46..255c6c4bbb89 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -396,7 +396,8 @@ struct lnet_peer { time64_t lp_last_query; /* when lp_ni was queried * last time */ - struct lnet_ni *lp_ni; /* interface peer is on */ + /* network peer is on */ + struct lnet_net *lp_net; lnet_nid_t lp_nid; /* peer's NID */ int lp_refcount; /* # refs */ int lp_cpt; /* CPT this peer attached on */ @@ -427,7 +428,7 @@ struct lnet_peer_table { * lnet_ni::ni_peertimeout has been set to a positive value */ #define lnet_peer_aliveness_enabled(lp) (the_lnet.ln_routing && \ - (lp)->lp_ni->ni_net->net_tunables.lct_peer_timeout > 0) + (lp)->lp_net->net_tunables.lct_peer_timeout > 0) struct lnet_route { struct list_head lr_list; /* chain on net */ diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 05687278334a..c21aef32cdde 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -680,6 +680,19 @@ lnet_net2ni(__u32 net) } EXPORT_SYMBOL(lnet_net2ni); +struct lnet_net * +lnet_get_net_locked(__u32 net_id) +{ + struct lnet_net *net; + + list_for_each_entry(net, &the_lnet.ln_nets, net_list) { + if (net->net_id == net_id) + return net; + } + + return NULL; +} + static unsigned int lnet_nid_cpt_hash(lnet_nid_t nid, unsigned int number) { diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index b2a52ddcefcb..b8b15f56a275 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -525,7 +525,7 @@ lnet_peer_is_alive(struct lnet_peer *lp, unsigned long now) return 0; deadline = lp->lp_last_alive + - lp->lp_ni->ni_net->net_tunables.lct_peer_timeout; + lp->lp_net->net_tunables.lct_peer_timeout; 
alive = deadline > now; /* Update obsolete lp_alive except for routers assumed to be dead @@ -544,7 +544,7 @@ lnet_peer_is_alive(struct lnet_peer *lp, unsigned long now) * may drop the lnet_net_lock */ static int -lnet_peer_alive_locked(struct lnet_peer *lp) +lnet_peer_alive_locked(struct lnet_ni *ni, struct lnet_peer *lp) { time64_t now = ktime_get_seconds(); @@ -570,13 +570,13 @@ lnet_peer_alive_locked(struct lnet_peer *lp) libcfs_nid2str(lp->lp_nid), now, next_query, lnet_queryinterval, - lp->lp_ni->ni_net->net_tunables.lct_peer_timeout); + lp->lp_net->net_tunables.lct_peer_timeout); return 0; } } /* query NI for latest aliveness news */ - lnet_ni_query_locked(lp->lp_ni, lp); + lnet_ni_query_locked(ni, lp); if (lnet_peer_is_alive(lp, now)) return 1; @@ -600,7 +600,7 @@ static int lnet_post_send_locked(struct lnet_msg *msg, int do_send) { struct lnet_peer *lp = msg->msg_txpeer; - struct lnet_ni *ni = lp->lp_ni; + struct lnet_ni *ni = msg->msg_txni; int cpt = msg->msg_tx_cpt; struct lnet_tx_queue *tq = ni->ni_tx_queues[cpt]; @@ -611,7 +611,7 @@ lnet_post_send_locked(struct lnet_msg *msg, int do_send) /* NB 'lp' is always the next hop */ if (!(msg->msg_target.pid & LNET_PID_USERFLAG) && - !lnet_peer_alive_locked(lp)) { + !lnet_peer_alive_locked(ni, lp)) { the_lnet.ln_counters[cpt]->drop_count++; the_lnet.ln_counters[cpt]->drop_length += msg->msg_len; lnet_net_unlock(cpt); @@ -770,7 +770,7 @@ lnet_post_routed_recv_locked(struct lnet_msg *msg, int do_recv) int cpt = msg->msg_rx_cpt; lnet_net_unlock(cpt); - lnet_ni_recv(lp->lp_ni, msg->msg_private, msg, 1, + lnet_ni_recv(msg->msg_rxni, msg->msg_private, msg, 1, 0, msg->msg_len, msg->msg_len); lnet_net_lock(cpt); } @@ -785,7 +785,7 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) struct lnet_ni *txni = msg->msg_txni; if (msg->msg_txcredit) { - struct lnet_ni *ni = txpeer->lp_ni; + struct lnet_ni *ni = msg->msg_txni; struct lnet_tx_queue *tq = ni->ni_tx_queues[msg->msg_tx_cpt]; /* give back NI txcredits */ @@ 
-800,7 +800,7 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) struct lnet_msg, msg_list); list_del(&msg2->msg_list); - LASSERT(msg2->msg_txpeer->lp_ni == ni); + LASSERT(msg2->msg_txni == ni); LASSERT(msg2->msg_tx_delayed); (void)lnet_post_send_locked(msg2, 1); @@ -869,7 +869,7 @@ lnet_drop_routed_msgs_locked(struct list_head *list, int cpt) while(!list_empty(&drop)) { msg = list_first_entry(&drop, struct lnet_msg, msg_list); - lnet_ni_recv(msg->msg_rxpeer->lp_ni, msg->msg_private, NULL, + lnet_ni_recv(msg->msg_rxni, msg->msg_private, NULL, 0, 0, 0, msg->msg_hdr.payload_length); list_del_init(&msg->msg_list); lnet_finalize(NULL, msg, -ECANCELED); @@ -1007,7 +1007,7 @@ lnet_compare_routes(struct lnet_route *r1, struct lnet_route *r2) } static struct lnet_peer * -lnet_find_route_locked(struct lnet_ni *ni, lnet_nid_t target, +lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, lnet_nid_t rtr_nid) { struct lnet_remotenet *rnet; @@ -1035,7 +1035,7 @@ lnet_find_route_locked(struct lnet_ni *ni, lnet_nid_t target, if (!lnet_is_route_alive(route)) continue; - if (ni && lp->lp_ni != ni) + if (net && lp->lp_net != net) continue; if (lp->lp_nid == rtr_nid) /* it's pre-determined router */ @@ -1164,10 +1164,12 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) /* ENOMEM or shutting down */ return rc; } - LASSERT(lp->lp_ni == src_ni); + LASSERT(lp->lp_net == src_ni->ni_net); } else { /* sending to a remote network */ - lp = lnet_find_route_locked(src_ni, dst_nid, rtr_nid); + lp = lnet_find_route_locked(src_ni != NULL ? 
+ src_ni->ni_net : NULL, + dst_nid, rtr_nid); if (!lp) { if (src_ni) lnet_ni_decref_locked(src_ni, cpt); @@ -1203,10 +1205,11 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) lnet_msgtyp2str(msg->msg_type), msg->msg_len); if (!src_ni) { - src_ni = lp->lp_ni; + src_ni = lnet_get_next_ni_locked(lp->lp_net, NULL); + LASSERT(src_ni != NULL); src_nid = src_ni->ni_nid; } else { - LASSERT(src_ni == lp->lp_ni); + LASSERT(src_ni->ni_net == lp->lp_net); lnet_ni_decref_locked(src_ni, cpt); } @@ -1918,7 +1921,7 @@ lnet_drop_delayed_msg_list(struct list_head *head, char *reason) * called lnet_drop_message(), so I just hang onto msg as well * until that's done */ - lnet_drop_message(msg->msg_rxpeer->lp_ni, + lnet_drop_message(msg->msg_rxni, msg->msg_rxpeer->lp_cpt, msg->msg_private, msg->msg_len); /* @@ -1926,7 +1929,7 @@ lnet_drop_delayed_msg_list(struct list_head *head, char *reason) * but we still should give error code so lnet_msg_decommit() * can skip counters operations and other checks. 
*/ - lnet_finalize(msg->msg_rxpeer->lp_ni, msg, -ENOENT); + lnet_finalize(msg->msg_rxni, msg, -ENOENT); } } @@ -1959,7 +1962,7 @@ lnet_recv_delayed_msg_list(struct list_head *head) msg->msg_hdr.msg.put.offset, msg->msg_hdr.payload_length); - lnet_recv_put(msg->msg_rxpeer->lp_ni, msg); + lnet_recv_put(msg->msg_rxni, msg); } } @@ -2384,8 +2387,12 @@ LNetDist(lnet_nid_t dstnid, lnet_nid_t *srcnidp, __u32 *orderp) LASSERT(shortest); hops = shortest_hops; - if (srcnidp) - *srcnidp = shortest->lr_gateway->lp_ni->ni_nid; + if (srcnidp) { + ni = lnet_get_next_ni_locked( + shortest->lr_gateway->lp_net, + NULL); + *srcnidp = ni->ni_nid; + } if (orderp) *orderp = order; lnet_net_unlock(cpt); diff --git a/drivers/staging/lustre/lnet/lnet/lib-ptl.c b/drivers/staging/lustre/lnet/lnet/lib-ptl.c index fc47379c5938..4c5737083422 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-ptl.c +++ b/drivers/staging/lustre/lnet/lnet/lib-ptl.c @@ -946,7 +946,7 @@ lnet_clear_lazy_portal(struct lnet_ni *ni, int portal, char *reason) /* grab all messages which are on the NI passed in */ list_for_each_entry_safe(msg, tmp, &ptl->ptl_msg_delayed, msg_list) { - if (msg->msg_rxpeer->lp_ni == ni) + if (msg->msg_txni == ni || msg->msg_rxni == ni) list_move(&msg->msg_list, &zombies); } } else { diff --git a/drivers/staging/lustre/lnet/lnet/net_fault.c b/drivers/staging/lustre/lnet/lnet/net_fault.c index 41d6131ee15a..6c53ae1811e5 100644 --- a/drivers/staging/lustre/lnet/lnet/net_fault.c +++ b/drivers/staging/lustre/lnet/lnet/net_fault.c @@ -601,8 +601,9 @@ delayed_msg_process(struct list_head *msg_list, bool drop) msg = list_entry(msg_list->next, struct lnet_msg, msg_list); LASSERT(msg->msg_rxpeer); + LASSERT(msg->msg_rxni != NULL); - ni = msg->msg_rxpeer->lp_ni; + ni = msg->msg_rxni; cpt = msg->msg_rx_cpt; list_del_init(&msg->msg_list); diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index b76ac3e051d9..ed29124ebded 100644 --- 
a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -112,7 +112,7 @@ lnet_peer_table_cleanup_locked(struct lnet_ni *ni, for (i = 0; i < LNET_PEER_HASH_SIZE; i++) { list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i], lp_hashlist) { - if (ni && ni != lp->lp_ni) + if (ni && ni->ni_net != lp->lp_net) continue; list_del_init(&lp->lp_hashlist); /* Lose hash table's ref */ @@ -154,7 +154,7 @@ lnet_peer_table_del_rtrs_locked(struct lnet_ni *ni, for (i = 0; i < LNET_PEER_HASH_SIZE; i++) { list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i], lp_hashlist) { - if (ni != lp->lp_ni) + if (ni->ni_net != lp->lp_net) continue; if (!lp->lp_rtr_refcount) @@ -230,8 +230,7 @@ lnet_destroy_peer_locked(struct lnet_peer *lp) LASSERT(ptable->pt_number > 0); ptable->pt_number--; - lnet_ni_decref_locked(lp->lp_ni, lp->lp_cpt); - lp->lp_ni = NULL; + lp->lp_net = NULL; list_add(&lp->lp_hashlist, &ptable->pt_deathrow); LASSERT(ptable->pt_zombies > 0); @@ -336,16 +335,11 @@ lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) goto out; } - lp->lp_ni = lnet_net2ni_locked(LNET_NIDNET(nid), cpt2); - if (!lp->lp_ni) { - rc = -EHOSTUNREACH; - goto out; - } - - lp->lp_txcredits = lp->lp_ni->ni_net->net_tunables.lct_peer_tx_credits; - lp->lp_mintxcredits = lp->lp_ni->ni_net->net_tunables.lct_peer_tx_credits; - lp->lp_rtrcredits = lnet_peer_buffer_credits(lp->lp_ni); - lp->lp_minrtrcredits = lnet_peer_buffer_credits(lp->lp_ni); + lp->lp_net = lnet_get_net_locked(LNET_NIDNET(lp->lp_nid)); + lp->lp_txcredits = + lp->lp_mintxcredits = lp->lp_net->net_tunables.lct_peer_tx_credits; + lp->lp_rtrcredits = + lp->lp_minrtrcredits = lnet_peer_buffer_credits(lp->lp_net); list_add_tail(&lp->lp_hashlist, &ptable->pt_hash[lnet_nid2peerhash(nid)]); @@ -383,7 +377,7 @@ lnet_debug_peer(lnet_nid_t nid) CDEBUG(D_WARNING, "%-24s %4d %5s %5d %5d %5d %5d %5d %ld\n", libcfs_nid2str(lp->lp_nid), lp->lp_refcount, - aliveness, 
lp->lp_ni->ni_net->net_tunables.lct_peer_tx_credits, + aliveness, lp->lp_net->net_tunables.lct_peer_tx_credits, lp->lp_rtrcredits, lp->lp_minrtrcredits, lp->lp_txcredits, lp->lp_mintxcredits, lp->lp_txqnob); @@ -439,7 +433,7 @@ lnet_get_peer_info(__u32 peer_index, __u64 *nid, *nid = lp->lp_nid; *refcount = lp->lp_refcount; *ni_peer_tx_credits = - lp->lp_ni->ni_net->net_tunables.lct_peer_tx_credits; + lp->lp_net->net_tunables.lct_peer_tx_credits; *peer_tx_credits = lp->lp_txcredits; *peer_rtr_credits = lp->lp_rtrcredits; *peer_min_rtr_credits = lp->lp_mintxcredits; diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 135dfe793b0b..72b8ca2b0fc6 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -55,10 +55,8 @@ module_param(auto_down, int, 0444); MODULE_PARM_DESC(auto_down, "Automatically mark peers down on comms error"); int -lnet_peer_buffer_credits(struct lnet_ni *ni) +lnet_peer_buffer_credits(struct lnet_net *net) { - struct lnet_net *net = ni->ni_net; - /* NI option overrides LNet default */ if (net->net_tunables.lct_peer_rtr_credits > 0) return net->net_tunables.lct_peer_rtr_credits; @@ -373,7 +371,7 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, lnet_peer_addref_locked(route->lr_gateway); /* +1 for notify */ lnet_add_route_to_rnet(rnet2, route); - ni = route->lr_gateway->lp_ni; + ni = lnet_get_next_ni_locked(route->lr_gateway->lp_net, NULL); lnet_net_unlock(LNET_LOCK_EX); /* XXX Assume alive */ @@ -428,8 +426,8 @@ lnet_check_routes(void) continue; } - if (route->lr_gateway->lp_ni == - route2->lr_gateway->lp_ni) + if (route->lr_gateway->lp_net == + route2->lr_gateway->lp_net) continue; nid1 = route->lr_gateway->lp_nid; @@ -952,6 +950,7 @@ lnet_ping_router_locked(struct lnet_peer *rtr) struct lnet_rc_data *rcd = NULL; time64_t now = ktime_get_seconds(); time64_t secs; + struct lnet_ni *ni; lnet_peer_addref_locked(rtr); @@ -960,7 +959,8 @@ 
lnet_ping_router_locked(struct lnet_peer *rtr) lnet_notify_locked(rtr, 1, 0, now); /* Run any outstanding notifications */ - lnet_ni_notify_locked(rtr->lp_ni, rtr); + ni = lnet_get_next_ni_locked(rtr->lp_net, NULL); + lnet_ni_notify_locked(ni, rtr); if (!lnet_isrouter(rtr) || the_lnet.ln_rc_state != LNET_RC_STATE_RUNNING) { diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c index 2a366e9a8627..52714b898aac 100644 --- a/drivers/staging/lustre/lnet/lnet/router_proc.c +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c @@ -489,7 +489,7 @@ static int proc_lnet_peers(struct ctl_table *table, int write, int nrefs = peer->lp_refcount; time64_t lastalive = -1; char *aliveness = "NA"; - int maxcr = peer->lp_ni->ni_net->net_tunables.lct_peer_tx_credits; + int maxcr = peer->lp_net->net_tunables.lct_peer_tx_credits; int txcr = peer->lp_txcredits; int mintxcr = peer->lp_mintxcredits; int rtrcr = peer->lp_rtrcredits; From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591345 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 69113921 for ; Fri, 7 Sep 2018 00:52:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5A9E12B17D for ; Fri, 7 Sep 2018 00:52:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4E1A62B181; Fri, 7 Sep 2018 00:52:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com 
[64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0C5072B17D for ; Fri, 7 Sep 2018 00:52:42 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B464F4E313E; Thu, 6 Sep 2018 17:52:41 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C543A4E3048 for ; Thu, 6 Sep 2018 17:52:39 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 13DF0AEF1; Fri, 7 Sep 2018 00:52:39 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137155.8267.2566576537174390617.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 08/34] lnet: add cpt to lnet_match_info. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This seems to be a more direct way to get the cpt needed in lnet_mt_of_match(). 
This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-types.h | 1 + drivers/staging/lustre/lnet/lnet/lib-move.c | 1 + drivers/staging/lustre/lnet/lnet/lib-ptl.c | 2 +- 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 255c6c4bbb89..2d2c066a11ba 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -511,6 +511,7 @@ enum lnet_match_flags { struct lnet_match_info { __u64 mi_mbits; struct lnet_process_id mi_id; + unsigned int mi_cpt; unsigned int mi_opc; unsigned int mi_portal; unsigned int mi_rlength; diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index b8b15f56a275..b6e81a693fc3 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1303,6 +1303,7 @@ lnet_parse_put(struct lnet_ni *ni, struct lnet_msg *msg) info.mi_rlength = hdr->payload_length; info.mi_roffset = hdr->msg.put.offset; info.mi_mbits = hdr->msg.put.match_bits; + info.mi_cpt = msg->msg_rxpeer->lp_cpt; msg->msg_rx_ready_delay = !ni->ni_net->net_lnd->lnd_eager_recv; ready_delay = msg->msg_rx_ready_delay; diff --git a/drivers/staging/lustre/lnet/lnet/lib-ptl.c b/drivers/staging/lustre/lnet/lnet/lib-ptl.c index 4c5737083422..90ce51801726 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-ptl.c +++ b/drivers/staging/lustre/lnet/lnet/lib-ptl.c @@ -292,7 +292,7 @@ lnet_mt_of_match(struct lnet_match_info *info, struct lnet_msg *msg) rotor = ptl->ptl_rotor++; /* get round-robin factor */ if (portal_rotor == LNET_PTL_ROTOR_HASH_RT && routed) - cpt = lnet_cpt_of_nid(msg->msg_hdr.src_nid); + cpt = info->mi_cpt; else cpt = rotor % LNET_CPT_NUMBER; From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591347 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 84287112B for ; Fri, 7 Sep 2018 00:52:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 74B442B17D for ; Fri, 7 Sep 2018 00:52:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 671312B181; Fri, 7 Sep 2018 00:52:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E9EB32B17D for ; Fri, 7 Sep 2018 00:52:47 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 785A04E3156; Thu, 6 Sep 2018 17:52:47 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 6193F4E311B for ; Thu, 6 Sep 2018 17:52:46 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 92281AED7; Fri, 7 Sep 2018 00:52:45 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137159.8267.921309094971745898.stgit@noble> In-Reply-To: 
<153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 09/34] lnet: add list of cpts to lnet_net. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP struct lnet_net now has a list of cpts, which is the union of the cpts for each lnet_ni. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-types.h | 6 + drivers/staging/lustre/lnet/lnet/config.c | 164 ++++++++++++++++++++ 2 files changed, 170 insertions(+) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 2d2c066a11ba..22957d142cc0 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -266,6 +266,12 @@ struct lnet_net { * lnet/include/lnet/nidstr.h */ __u32 net_id; + /* total number of CPTs in the array */ + __u32 net_ncpts; + + /* cumulative CPTs of all NIs in this net */ + __u32 *net_cpts; + /* network tunables */ struct lnet_ioctl_config_lnd_cmn_tunables net_tunables; diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index e83bdbec11e3..380a3fb1caba 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -91,11 +91,169 @@ lnet_net_unique(__u32 net, struct list_head *netlist) return true; } +static bool +in_array(__u32 *array, __u32 size, __u32 value) +{ + int i; + + for (i = 0; i < size; i++) { + if (array[i] == value) + return true; + } + + return false; +} + +static int +lnet_net_append_cpts(__u32 *cpts, __u32 ncpts, struct lnet_net *net) +{ + __u32 *added_cpts = NULL; + int i, j = 0, rc = 0; + + /* + * no need to go further since a subset of the NIs already exist on + * all CPTs + */ + if (net->net_ncpts == LNET_CPT_NUMBER) + return 0; + + if (cpts == NULL) { + /* there is an NI which will exist on all CPTs */ + if (net->net_cpts != NULL) + kvfree(net->net_cpts); + net->net_cpts = NULL; + net->net_ncpts = LNET_CPT_NUMBER; + return 0; + } + + if (net->net_cpts == NULL) { + net->net_cpts = kmalloc_array(ncpts, sizeof(*net->net_cpts), + GFP_KERNEL); + if (net->net_cpts == NULL) + return -ENOMEM; + memcpy(net->net_cpts, cpts, ncpts * sizeof(*net->net_cpts)); + return 0; + } + + added_cpts = 
kmalloc_array(LNET_CPT_NUMBER, sizeof(*added_cpts), + GFP_KERNEL); + if (added_cpts == NULL) + return -ENOMEM; + + for (i = 0; i < ncpts; i++) { + if (!in_array(net->net_cpts, net->net_ncpts, cpts[i])) { + added_cpts[j] = cpts[i]; + j++; + } + } + + /* append the new cpts if any to the list of cpts in the net */ + if (j > 0) { + __u32 *array = NULL, *loc; + __u32 total_entries = j + net->net_ncpts; + + array = kmalloc_array(total_entries, sizeof(*net->net_cpts), + GFP_KERNEL); + if (array == NULL) { + rc = -ENOMEM; + goto failed; + } + + memcpy(array, net->net_cpts, + net->net_ncpts * sizeof(*net->net_cpts)); + loc = array + net->net_ncpts; + memcpy(loc, added_cpts, j * sizeof(*net->net_cpts)); + + kfree(net->net_cpts); + net->net_ncpts = total_entries; + net->net_cpts = array; + } + +failed: + kfree(added_cpts); + + return rc; +} + +static void +lnet_net_remove_cpts(__u32 *cpts, __u32 ncpts, struct lnet_net *net) +{ + struct lnet_ni *ni; + int rc; + + /* + * Operation Assumption: + * This function is called after an NI has been removed from + * its parent net. + * + * if we're removing an NI which exists on all CPTs then + * we have to check if any of the other NIs on this net also + * exists on all CPTs. If none, then we need to build our Net CPT + * list based on the remaining NIs. + * + * If the NI being removed exists on a subset of the CPTs then we + * also rebuild the Net CPT list based on the remaining NIs, which + * should result in the expected Net CPT list. + */ + + /* + * sometimes this function can be called due to some failure + * creating an NI, before any of the cpts are allocated, so check + * for that case and don't do anything + */ + if (ncpts == 0) + return; + + if (ncpts == LNET_CPT_NUMBER) { + /* + * first iteration through the NI list in the net to see + * if any of the NIs exist on all the CPTs. If one is + * found then our job is done. 
+ */ + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) { + if (ni->ni_ncpts == LNET_CPT_NUMBER) + return; + } + } + + /* + * Rebuild the Net CPT list, including only the + * CPTs which the remaining NIs are associated with. + */ + if (net->net_cpts != NULL) { + kfree(net->net_cpts); + net->net_cpts = NULL; + } + + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) { + rc = lnet_net_append_cpts(ni->ni_cpts, ni->ni_ncpts, + net); + if (rc != 0) { + CERROR("Out of Memory\n"); + /* + * do our best to keep on going. Delete + * the net cpts and set it to NULL. This + * way we can keep on going but less + * efficiently, since memory accesses might be + * across CPT lines. + */ + if (net->net_cpts != NULL) { + kfree(net->net_cpts); + net->net_cpts = NULL; + net->net_ncpts = LNET_CPT_NUMBER; + } + return; + } + } +} + void lnet_ni_free(struct lnet_ni *ni) { int i; + lnet_net_remove_cpts(ni->ni_cpts, ni->ni_ncpts, ni->ni_net); + if (ni->ni_refs) cfs_percpt_free(ni->ni_refs); @@ -128,6 +286,9 @@ lnet_net_free(struct lnet_net *net) lnet_ni_free(ni); } + if (net->net_cpts != NULL) + kfree(net->net_cpts); + kfree(net); } @@ -229,6 +390,9 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) ni->ni_net_ns = NULL; ni->ni_last_alive = ktime_get_real_seconds(); + rc = lnet_net_append_cpts(ni->ni_cpts, ni->ni_ncpts, net); + if (rc != 0) + goto failed; list_add_tail(&ni->ni_netlist, &net->net_ni_list); return ni; From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591349 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 65359112B for ; Fri, 7 Sep 2018 00:52:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with 
ESMTP id 524E9283FF for ; Fri, 7 Sep 2018 00:52:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 448F32859E; Fri, 7 Sep 2018 00:52:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 842E6283FF for ; Fri, 7 Sep 2018 00:52:56 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 23A034E3165; Thu, 6 Sep 2018 17:52:56 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 6B8EF4E30C2 for ; Thu, 6 Sep 2018 17:52:54 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 7F49AAEF1; Fri, 7 Sep 2018 00:52:53 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137163.8267.8668023630519839070.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 10/34] lnet: add ni arg to lnet_cpt_of_nid() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP When choosing a cpt to use for a given network (identified by nid), the choice might depend on a particular interface which has already been identified - different interfaces can have different sets of cpts. So add an 'ni' arg to lnet_cpt_of_nid(). If given, choose a cpt from the cpts of that interface. If not given, choose one from the set of all cpts associated with any interface on the network. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 4 +- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 4 +- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 - .../staging/lustre/lnet/klnds/socklnd/socklnd.c | 4 +- drivers/staging/lustre/lnet/lnet/api-ni.c | 41 ++++++++++++-------- drivers/staging/lustre/lnet/lnet/lib-move.c | 12 +++--- drivers/staging/lustre/lnet/lnet/lib-ptl.c | 2 - drivers/staging/lustre/lnet/lnet/peer.c | 4 +- drivers/staging/lustre/lnet/lnet/router.c | 4 +- drivers/staging/lustre/lnet/selftest/brw_test.c | 2 - drivers/staging/lustre/lnet/selftest/framework.c | 3 + drivers/staging/lustre/lnet/selftest/selftest.h | 2 - 12 files changed, 48 insertions(+), 36 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 34509e52bac7..e32dbb854d80 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -395,8 +395,8 @@ lnet_net2rnethash(__u32 net) extern struct lnet_lnd the_lolnd; extern int avoid_asym_router_failure; -int lnet_cpt_of_nid_locked(lnet_nid_t nid); -int lnet_cpt_of_nid(lnet_nid_t nid); +int lnet_cpt_of_nid_locked(lnet_nid_t nid, struct lnet_ni *ni); +int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni); struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt); struct lnet_ni *lnet_net2ni_locked(__u32 net, int cpt); struct lnet_ni *lnet_net2ni(__u32 net); diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index ade566d20c69..958ac9a99045 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -320,7 +320,7 @@ int kiblnd_create_peer(struct lnet_ni *ni, struct kib_peer **peerp, { struct kib_peer *peer; struct kib_net *net = ni->ni_data; - int cpt = lnet_cpt_of_nid(nid); + int cpt = lnet_cpt_of_nid(nid, ni); 
unsigned long flags; LASSERT(net); @@ -643,7 +643,7 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, struct rdma_cm_id *cm dev = net->ibn_dev; - cpt = lnet_cpt_of_nid(peer->ibp_nid); + cpt = lnet_cpt_of_nid(peer->ibp_nid, peer->ibp_ni); sched = kiblnd_data.kib_scheds[cpt]; LASSERT(sched->ibs_nthreads > 0); diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c index c266940cb2ae..e64c14914924 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -119,7 +119,7 @@ kiblnd_get_idle_tx(struct lnet_ni *ni, lnet_nid_t target) struct kib_tx *tx; struct kib_tx_poolset *tps; - tps = net->ibn_tx_ps[lnet_cpt_of_nid(target)]; + tps = net->ibn_tx_ps[lnet_cpt_of_nid(target, ni)]; node = kiblnd_pool_alloc_node(&tps->tps_poolset); if (!node) return NULL; diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c index 2036a0ae5917..ba68bcee90bc 100644 --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c @@ -101,7 +101,7 @@ static int ksocknal_create_peer(struct ksock_peer **peerp, struct lnet_ni *ni, struct lnet_process_id id) { - int cpt = lnet_cpt_of_nid(id.nid); + int cpt = lnet_cpt_of_nid(id.nid, ni); struct ksock_net *net = ni->ni_data; struct ksock_peer *peer; @@ -1099,7 +1099,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, LASSERT(conn->ksnc_proto); LASSERT(peerid.nid != LNET_NID_ANY); - cpt = lnet_cpt_of_nid(peerid.nid); + cpt = lnet_cpt_of_nid(peerid.nid, ni); if (active) { ksocknal_peer_addref(peer); diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index c21aef32cdde..6e0b8310574d 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -713,31 +713,41 @@ 
lnet_nid_cpt_hash(lnet_nid_t nid, unsigned int number) } int -lnet_cpt_of_nid_locked(lnet_nid_t nid) +lnet_cpt_of_nid_locked(lnet_nid_t nid, struct lnet_ni *ni) { - struct lnet_ni *ni; + struct lnet_net *net; /* must called with hold of lnet_net_lock */ if (LNET_CPT_NUMBER == 1) return 0; /* the only one */ - /* take lnet_net_lock(any) would be OK */ - if (!list_empty(&the_lnet.ln_nis_cpt)) { - list_for_each_entry(ni, &the_lnet.ln_nis_cpt, ni_cptlist) { - if (LNET_NIDNET(ni->ni_nid) != LNET_NIDNET(nid)) - continue; + /* + * If NI is provided then use the CPT identified in the NI cpt + * list if one exists. If one doesn't exist, then that NI is + * associated with all CPTs and it follows that the net it belongs + * to is implicitly associated with all CPTs, so just hash the nid + * and return that. + */ + if (ni != NULL) { + if (ni->ni_cpts != NULL) + return ni->ni_cpts[lnet_nid_cpt_hash(nid, + ni->ni_ncpts)]; + else + return lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); + } - LASSERT(ni->ni_cpts); - return ni->ni_cpts[lnet_nid_cpt_hash - (nid, ni->ni_ncpts)]; - } + /* no NI provided so look at the net */ + net = lnet_get_net_locked(LNET_NIDNET(nid)); + + if (net != NULL && net->net_cpts) { + return net->net_cpts[lnet_nid_cpt_hash(nid, net->net_ncpts)]; } return lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); } int -lnet_cpt_of_nid(lnet_nid_t nid) +lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni) { int cpt; int cpt2; @@ -745,11 +755,10 @@ lnet_cpt_of_nid(lnet_nid_t nid) if (LNET_CPT_NUMBER == 1) return 0; /* the only one */ - if (list_empty(&the_lnet.ln_nis_cpt)) - return lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); - cpt = lnet_net_lock_current(); - cpt2 = lnet_cpt_of_nid_locked(nid); + + cpt2 = lnet_cpt_of_nid_locked(nid, ni); + lnet_net_unlock(cpt); return cpt2; diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index b6e81a693fc3..02cd1a5a466f 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ 
b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1095,7 +1095,9 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) msg->msg_sending = 1; LASSERT(!msg->msg_tx_committed); - cpt = lnet_cpt_of_nid(rtr_nid == LNET_NID_ANY ? dst_nid : rtr_nid); + local_ni = lnet_net2ni(LNET_NIDNET(dst_nid)); + cpt = lnet_cpt_of_nid(rtr_nid == LNET_NID_ANY ? dst_nid : rtr_nid, + local_ni); again: lnet_net_lock(cpt); @@ -1188,7 +1190,7 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) * was changed when we release the lock */ if (rtr_nid != lp->lp_nid) { - cpt2 = lnet_cpt_of_nid_locked(lp->lp_nid); + cpt2 = lp->lp_cpt; if (cpt2 != cpt) { if (src_ni) lnet_ni_decref_locked(src_ni, cpt); @@ -1677,7 +1679,7 @@ lnet_parse(struct lnet_ni *ni, struct lnet_hdr *hdr, lnet_nid_t from_nid, payload_length = le32_to_cpu(hdr->payload_length); for_me = (ni->ni_nid == dest_nid); - cpt = lnet_cpt_of_nid(from_nid); + cpt = lnet_cpt_of_nid(from_nid, ni); switch (type) { case LNET_MSG_ACK: @@ -2149,7 +2151,7 @@ lnet_create_reply_msg(struct lnet_ni *ni, struct lnet_msg *getmsg) lnet_msg_attach_md(msg, getmd, getmd->md_offset, getmd->md_length); lnet_res_unlock(cpt); - cpt = lnet_cpt_of_nid(peer_id.nid); + cpt = lnet_cpt_of_nid(peer_id.nid, ni); lnet_net_lock(cpt); lnet_msg_commit(msg, cpt); @@ -2160,7 +2162,7 @@ lnet_create_reply_msg(struct lnet_ni *ni, struct lnet_msg *getmsg) return msg; drop: - cpt = lnet_cpt_of_nid(peer_id.nid); + cpt = lnet_cpt_of_nid(peer_id.nid, ni); lnet_net_lock(cpt); the_lnet.ln_counters[cpt]->drop_count++; diff --git a/drivers/staging/lustre/lnet/lnet/lib-ptl.c b/drivers/staging/lustre/lnet/lnet/lib-ptl.c index 90ce51801726..c8d8162cc706 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-ptl.c +++ b/drivers/staging/lustre/lnet/lnet/lib-ptl.c @@ -220,7 +220,7 @@ lnet_match2mt(struct lnet_portal *ptl, struct lnet_process_id id, __u64 mbits) /* if it's a unique portal, return match-table hashed by NID */ return lnet_ptl_is_unique(ptl) ? 
- ptl->ptl_mtables[lnet_cpt_of_nid(id.nid)] : NULL; + ptl->ptl_mtables[lnet_cpt_of_nid(id.nid, NULL)] : NULL; } struct lnet_match_table * diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index ed29124ebded..808ce25f1f00 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -270,7 +270,7 @@ lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) return -ESHUTDOWN; /* cpt can be LNET_LOCK_EX if it's called from router functions */ - cpt2 = cpt != LNET_LOCK_EX ? cpt : lnet_cpt_of_nid_locked(nid); + cpt2 = cpt != LNET_LOCK_EX ? cpt : lnet_cpt_of_nid_locked(nid, NULL); ptable = the_lnet.ln_peer_tables[cpt2]; lp = lnet_find_peer_locked(ptable, nid); @@ -362,7 +362,7 @@ lnet_debug_peer(lnet_nid_t nid) int rc; int cpt; - cpt = lnet_cpt_of_nid(nid); + cpt = lnet_cpt_of_nid(nid, NULL); lnet_net_lock(cpt); rc = lnet_nid2peer_locked(&lp, nid, cpt); diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 72b8ca2b0fc6..5493d13de6d9 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -1207,7 +1207,7 @@ lnet_router_checker(void *arg) version = the_lnet.ln_routers_version; list_for_each_entry(rtr, &the_lnet.ln_routers, lp_rtr_list) { - cpt2 = lnet_cpt_of_nid_locked(rtr->lp_nid); + cpt2 = rtr->lp_cpt; if (cpt != cpt2) { lnet_net_unlock(cpt); cpt = cpt2; @@ -1693,7 +1693,7 @@ lnet_notify(struct lnet_ni *ni, lnet_nid_t nid, int alive, time64_t when) { struct lnet_peer *lp = NULL; time64_t now = ktime_get_seconds(); - int cpt = lnet_cpt_of_nid(nid); + int cpt = lnet_cpt_of_nid(nid, ni); LASSERT(!in_interrupt()); diff --git a/drivers/staging/lustre/lnet/selftest/brw_test.c b/drivers/staging/lustre/lnet/selftest/brw_test.c index f1ee219bc8f3..e372ff3044c8 100644 --- a/drivers/staging/lustre/lnet/selftest/brw_test.c +++ b/drivers/staging/lustre/lnet/selftest/brw_test.c @@ -124,7 
+124,7 @@ brw_client_init(struct sfw_test_instance *tsi) return -EINVAL; list_for_each_entry(tsu, &tsi->tsi_units, tsu_list) { - bulk = srpc_alloc_bulk(lnet_cpt_of_nid(tsu->tsu_dest.nid), + bulk = srpc_alloc_bulk(lnet_cpt_of_nid(tsu->tsu_dest.nid, NULL), off, npg, len, opc == LST_BRW_READ); if (!bulk) { brw_client_fini(tsi); diff --git a/drivers/staging/lustre/lnet/selftest/framework.c b/drivers/staging/lustre/lnet/selftest/framework.c index 944a2a6598fa..a82efc394659 100644 --- a/drivers/staging/lustre/lnet/selftest/framework.c +++ b/drivers/staging/lustre/lnet/selftest/framework.c @@ -1013,7 +1013,8 @@ sfw_run_batch(struct sfw_batch *tsb) tsu->tsu_loop = tsi->tsi_loop; wi = &tsu->tsu_worker; swi_init_workitem(wi, sfw_run_test, - lst_test_wq[lnet_cpt_of_nid(tsu->tsu_dest.nid)]); + lst_test_wq[lnet_cpt_of_nid(tsu->tsu_dest.nid, + NULL)]); swi_schedule_workitem(wi); } } diff --git a/drivers/staging/lustre/lnet/selftest/selftest.h b/drivers/staging/lustre/lnet/selftest/selftest.h index 9dbb0a51d430..edf783af90e8 100644 --- a/drivers/staging/lustre/lnet/selftest/selftest.h +++ b/drivers/staging/lustre/lnet/selftest/selftest.h @@ -527,7 +527,7 @@ srpc_init_client_rpc(struct srpc_client_rpc *rpc, struct lnet_process_id peer, INIT_LIST_HEAD(&rpc->crpc_list); swi_init_workitem(&rpc->crpc_wi, srpc_send_rpc, - lst_test_wq[lnet_cpt_of_nid(peer.nid)]); + lst_test_wq[lnet_cpt_of_nid(peer.nid, NULL)]); spin_lock_init(&rpc->crpc_lock); atomic_set(&rpc->crpc_refcount, 1); /* 1 ref for caller */ From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591351 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F0452921 for ; Fri, 7 Sep 2018 00:53:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by 
mail.wl.linuxfoundation.org (Postfix) with ESMTP id E2CEB283FF for ; Fri, 7 Sep 2018 00:53:03 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D674F2859E; Fri, 7 Sep 2018 00:53:03 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 866FA283FF for ; Fri, 7 Sep 2018 00:53:03 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2823C4E3152; Thu, 6 Sep 2018 17:53:03 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 409344E2FED for ; Thu, 6 Sep 2018 17:53:01 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 84A9CAED7; Fri, 7 Sep 2018 00:53:00 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137168.8267.1167942220741712684.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 11/34] lnet: pass tun to lnet_startup_lndni, instead of full conf X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software 
development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP I don't understand parts of this change. In particular, the removal of /* If given some LND tunable parameters, parse those now to * override the values in the NI structure. */ isn't clear to me. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 41 ++++++++--------------------- 1 file changed, 12 insertions(+), 29 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 6e0b8310574d..53ecfd700db3 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1240,10 +1240,8 @@ lnet_shutdown_lndni(struct lnet_ni *ni) } static int -lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) +lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) { - struct lnet_ioctl_config_lnd_tunables *lnd_tunables = NULL; - struct lnet_lnd_tunables *tun = NULL; int rc = -EINVAL; int lnd_type; struct lnet_lnd *lnd; @@ -1296,36 +1294,12 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_ioctl_config_data *conf) ni->ni_net->net_lnd = lnd; - if (conf && conf->cfg_hdr.ioc_len > sizeof(*conf)) { - lnd_tunables = (struct lnet_ioctl_config_lnd_tunables *)conf->cfg_bulk; - tun = &lnd_tunables->lt_tun; - } - if (tun) { memcpy(&ni->ni_lnd_tunables, tun, sizeof(*tun)); ni->ni_lnd_tunables_set = true; } - /* - * If given some LND tunable parameters, parse those now to - * override the values in the NI structure. 
- */ - if (conf) { - if (conf->cfg_config_u.cfg_net.net_peer_rtr_credits >= 0) - ni->ni_net->net_tunables.lct_peer_rtr_credits = - conf->cfg_config_u.cfg_net.net_peer_rtr_credits; - if (conf->cfg_config_u.cfg_net.net_peer_timeout >= 0) - ni->ni_net->net_tunables.lct_peer_timeout = - conf->cfg_config_u.cfg_net.net_peer_timeout; - if (conf->cfg_config_u.cfg_net.net_peer_tx_credits != -1) - ni->ni_net->net_tunables.lct_peer_tx_credits = - conf->cfg_config_u.cfg_net.net_peer_tx_credits; - if (conf->cfg_config_u.cfg_net.net_max_tx_credits >= 0) - ni->ni_net->net_tunables.lct_max_tx_credits = - conf->cfg_config_u.cfg_net.net_max_tx_credits; - } - rc = lnd->lnd_startup(ni); mutex_unlock(&the_lnet.ln_lnd_mutex); @@ -1861,9 +1835,13 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) struct list_head net_head; struct lnet_remotenet *rnet; int rc; + struct lnet_ioctl_config_lnd_tunables *lnd_tunables = NULL; INIT_LIST_HEAD(&net_head); + if (conf && conf->cfg_hdr.ioc_len > sizeof(*conf)) + lnd_tunables = (struct lnet_ioctl_config_lnd_tunables *)conf->cfg_bulk; + /* Create a net/ni structures for the network string */ rc = lnet_parse_networks(&net_head, nets); if (rc <= 0) @@ -1898,9 +1876,14 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) goto failed0; list_del_init(&net->net_list); + if (lnd_tunables) + memcpy(&net->net_tunables, + &lnd_tunables->lt_cmn, sizeof(lnd_tunables->lt_cmn)); + ni = list_first_entry(&net->net_ni_list, struct lnet_ni, ni_netlist); - rc = lnet_startup_lndni(ni, conf); - if (rc) + rc = lnet_startup_lndni(ni, (lnd_tunables ? 
+ &lnd_tunables->lt_tun : NULL)); + if (rc < 0) goto failed1; if (ni->ni_net->net_lnd->lnd_accept) { From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591353 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EC2B0112B for ; Fri, 7 Sep 2018 00:53:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DE674283FF for ; Fri, 7 Sep 2018 00:53:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D0F362859E; Fri, 7 Sep 2018 00:53:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4E39F283FF for ; Fri, 7 Sep 2018 00:53:10 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0B5B64E3177; Thu, 6 Sep 2018 17:53:10 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 69FBB4E30B4 for ; Thu, 6 Sep 2018 17:53:08 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 92E05AEF1; Fri, 7 Sep 2018 00:53:07 +0000 (UTC) From: 
NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137171.8267.13510813931908233567.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 12/34] lnet: split lnet_startup_lndni X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Split lnet_startup_lndni() into lnet_startup_lndnet(), which starts all NIs in a net, and lnet_startup_lndni(), which starts an individual NI. lnet_startup_lndni() returns 0 on success, or a negative error code. lnet_startup_lndnis() returned the count of interfaces started. The new lnet_startup_lndnet() returns the count of started interfaces. This requires adding lnet_shutdown_lndnet() to handle errors in lnet_dyn_add_ni(), which now uses the new lnet_startup_lndnet(). We now drop the ln_lnd_mutex near the end of lnet_startup_lndnet(), and re-claim it for each lnet_startup_lndni().
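The return-value convention described above can be sketched in isolation. This is only an illustration under stated assumptions: the demo_* names and structures are hypothetical stand-ins, not LNet code, and error cleanup is left out (the real lnet_startup_lndnets() jumps to a failure label instead of returning directly).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real LNet structures. */
struct demo_ni {
	int fail;		/* non-zero: pretend the LND startup fails */
};

struct demo_net {
	struct demo_ni *ni;	/* the single NI this sketch starts */
	int duplicate_lo;	/* non-zero: a loopback net that already exists */
};

/* Mirrors lnet_startup_lndni(): 0 on success, negative errno on failure. */
static int demo_startup_ni(struct demo_ni *ni)
{
	assert(ni != NULL);
	return ni->fail ? -EINVAL : 0;
}

/* Mirrors lnet_startup_lndnet(): count of NIs started, or negative errno. */
static int demo_startup_net(struct demo_net *net)
{
	int rc;

	assert(net != NULL);
	if (net->duplicate_lo)
		return 0;	/* nothing started, but not an error */

	rc = demo_startup_ni(net->ni);
	if (rc < 0)
		return rc;
	return 1;
}

/* Mirrors lnet_startup_lndnets(): sum the counts, stop at the first error. */
static int demo_startup_nets(struct demo_net *nets, size_t n)
{
	int ni_count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int rc = demo_startup_net(&nets[i]);

		if (rc < 0)
			return rc;
		ni_count += rc;
	}
	return ni_count;
}
```

Because the per-net helper returns a count rather than 0, the caller has to accumulate with "ni_count += rc" instead of "ni_count++", and must test for failure with "rc < 0" rather than "rc".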
This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 142 +++++++++++++++++++++++------ 1 file changed, 111 insertions(+), 31 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 53ecfd700db3..8afddf11b5e2 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1239,32 +1239,61 @@ lnet_shutdown_lndni(struct lnet_ni *ni) lnet_net_unlock(LNET_LOCK_EX); } +static void +lnet_shutdown_lndnet(struct lnet_net *net) +{ + struct lnet_ni *ni; + + lnet_net_lock(LNET_LOCK_EX); + + list_del_init(&net->net_list); + + while (!list_empty(&net->net_ni_list)) { + ni = list_entry(net->net_ni_list.next, + struct lnet_ni, ni_netlist); + lnet_net_unlock(LNET_LOCK_EX); + lnet_shutdown_lndni(ni); + lnet_net_lock(LNET_LOCK_EX); + } + + /* + * decrement ref count on lnd only when the entire network goes + * away + */ + net->net_lnd->lnd_refcount--; + + lnet_net_unlock(LNET_LOCK_EX); + + lnet_net_free(net); +} + static int -lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) +lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun); + +static int +lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) { - int rc = -EINVAL; - int lnd_type; - struct lnet_lnd *lnd; - struct lnet_tx_queue *tq; - int i; - u32 seed; + struct lnet_ni *ni; + __u32 lnd_type; + struct lnet_lnd *lnd; + int rc; - lnd_type = LNET_NETTYP(LNET_NIDNET(ni->ni_nid)); + lnd_type = LNET_NETTYP(net->net_id); LASSERT(libcfs_isknown_lnd(lnd_type)); /* Make sure this new NI is unique. 
*/ lnet_net_lock(LNET_LOCK_EX); - rc = lnet_net_unique(LNET_NIDNET(ni->ni_nid), &the_lnet.ln_nets); + rc = lnet_net_unique(net->net_id, &the_lnet.ln_nets); lnet_net_unlock(LNET_LOCK_EX); if (!rc) { if (lnd_type == LOLND) { - lnet_ni_free(ni); + lnet_net_free(net); return 0; } CERROR("Net %s is not unique\n", - libcfs_net2str(LNET_NIDNET(ni->ni_nid))); + libcfs_net2str(net->net_id)); rc = -EEXIST; goto failed0; } @@ -1291,8 +1320,32 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) lnet_net_lock(LNET_LOCK_EX); lnd->lnd_refcount++; lnet_net_unlock(LNET_LOCK_EX); + net->net_lnd = lnd; + mutex_unlock(&the_lnet.ln_lnd_mutex); - ni->ni_net->net_lnd = lnd; + ni = list_first_entry(&net->net_ni_list, struct lnet_ni, ni_netlist); + + rc = lnet_startup_lndni(ni, tun); + if (rc < 0) + return rc; + return 1; + +failed0: + lnet_net_free(net); + + return rc; +} + +static int +lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) +{ + int rc = -EINVAL; + struct lnet_tx_queue *tq; + int i; + struct lnet_net *net = ni->ni_net; + u32 seed; + + mutex_lock(&the_lnet.ln_lnd_mutex); if (tun) { memcpy(&ni->ni_lnd_tunables, tun, @@ -1300,15 +1353,15 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) ni->ni_lnd_tunables_set = true; } - rc = lnd->lnd_startup(ni); + rc = net->net_lnd->lnd_startup(ni); mutex_unlock(&the_lnet.ln_lnd_mutex); if (rc) { LCONSOLE_ERROR_MSG(0x105, "Error %d starting up LNI %s\n", - rc, libcfs_lnd2str(lnd->lnd_type)); + rc, libcfs_lnd2str(net->net_lnd->lnd_type)); lnet_net_lock(LNET_LOCK_EX); - lnd->lnd_refcount--; + net->net_lnd->lnd_refcount--; lnet_net_unlock(LNET_LOCK_EX); goto failed0; } @@ -1324,7 +1377,7 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) lnet_net_unlock(LNET_LOCK_EX); - if (lnd->lnd_type == LOLND) { + if (net->net_lnd->lnd_type == LOLND) { lnet_ni_addref(ni); LASSERT(!the_lnet.ln_loni); the_lnet.ln_loni = ni; @@ -1338,7 +1391,7 @@ lnet_startup_lndni(struct 
lnet_ni *ni, struct lnet_lnd_tunables *tun) if (!ni->ni_net->net_tunables.lct_peer_tx_credits || !ni->ni_net->net_tunables.lct_max_tx_credits) { LCONSOLE_ERROR_MSG(0x107, "LNI %s has no %scredits\n", - libcfs_lnd2str(lnd->lnd_type), + libcfs_lnd2str(net->net_lnd->lnd_type), !ni->ni_net->net_tunables.lct_peer_tx_credits ? "" : "per-peer "); /* @@ -1375,21 +1428,22 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) } static int -lnet_startup_lndnis(struct list_head *nilist) +lnet_startup_lndnets(struct list_head *netlist) { - struct lnet_ni *ni; + struct lnet_net *net; int rc; int ni_count = 0; - while (!list_empty(nilist)) { - ni = list_entry(nilist->next, struct lnet_ni, ni_netlist); - list_del(&ni->ni_netlist); - rc = lnet_startup_lndni(ni, NULL); + while (!list_empty(netlist)) { + net = list_entry(netlist->next, struct lnet_net, net_list); + list_del_init(&net->net_list); + + rc = lnet_startup_lndnet(net, NULL); if (rc < 0) goto failed; - ni_count++; + ni_count += rc; } return ni_count; @@ -1552,7 +1606,7 @@ LNetNIInit(lnet_pid_t requested_pid) goto err_empty_list; } - ni_count = lnet_startup_lndnis(&net_head); + ni_count = lnet_startup_lndnets(&net_head); if (ni_count < 0) { rc = ni_count; goto err_empty_list; @@ -1831,10 +1885,11 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) struct lnet_ping_info *pinfo; struct lnet_handle_md md_handle; struct lnet_net *net; - struct lnet_ni *ni; struct list_head net_head; struct lnet_remotenet *rnet; int rc; + int num_acceptor_nets; + __u32 net_type; struct lnet_ioctl_config_lnd_tunables *lnd_tunables = NULL; INIT_LIST_HEAD(&net_head); @@ -1876,22 +1931,47 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) goto failed0; list_del_init(&net->net_list); + if (lnd_tunables) memcpy(&net->net_tunables, &lnd_tunables->lt_cmn, sizeof(lnd_tunables->lt_cmn)); - ni = list_first_entry(&net->net_ni_list, struct lnet_ni, ni_netlist); - rc = 
lnet_startup_lndni(ni, (lnd_tunables ? + /* + * before starting this network get a count of the current TCP + * networks which require the acceptor thread running. If that + * count is == 0 before we start up this network, then we'd want to + * start up the acceptor thread after starting up this network + */ + num_acceptor_nets = lnet_count_acceptor_nets(); + + /* + * lnet_startup_lndnet() can deallocate 'net' even if it returns + * success, because we ended up adding interfaces to an existing + * network. So grab the net_type now + */ + net_type = LNET_NETTYP(net->net_id); + + rc = lnet_startup_lndnet(net, (lnd_tunables ? &lnd_tunables->lt_tun : NULL)); if (rc < 0) goto failed1; - if (ni->ni_net->net_lnd->lnd_accept) { + /* + * Start the acceptor thread if this is the first network + * being added that requires the thread. + */ + if (net_type == SOCKLND && num_acceptor_nets == 0) { rc = lnet_acceptor_start(); if (rc < 0) { - /* shutdown the ni that we just started */ + /* shutdown the net that we just started */ CERROR("Failed to start up acceptor thread\n"); - lnet_shutdown_lndni(ni); + /* + * Note that if we needed to start the acceptor + * thread, then 'net' must have been the first TCP + * network, therefore was unique, and therefore + * wasn't deallocated by lnet_startup_lndnet() + */ + lnet_shutdown_lndnet(net); goto failed1; } } From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591355 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D15F8112B for ; Fri, 7 Sep 2018 00:53:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C25CB283FF for ; Fri, 7 Sep 2018 00:53:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org
(Postfix, from userid 486) id B64572859E; Fri, 7 Sep 2018 00:53:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5E342283FF for ; Fri, 7 Sep 2018 00:53:19 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B2E2B4E318F; Thu, 6 Sep 2018 17:53:18 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0E2AF4E30B4 for ; Thu, 6 Sep 2018 17:53:17 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 464A9AED7; Fri, 7 Sep 2018 00:53:16 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137175.8267.2271624767774752203.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 13/34] lnet: reverse order of lnet_startup_lnd{net, ni} X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Change the order - no other change. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 135 ++++++++++++++--------------- 1 file changed, 66 insertions(+), 69 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 8afddf11b5e2..09ea7e506128 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1267,75 +1267,6 @@ lnet_shutdown_lndnet(struct lnet_net *net) lnet_net_free(net); } -static int -lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun); - -static int -lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) -{ - struct lnet_ni *ni; - __u32 lnd_type; - struct lnet_lnd *lnd; - int rc; - - lnd_type = LNET_NETTYP(net->net_id); - - LASSERT(libcfs_isknown_lnd(lnd_type)); - - /* Make sure this new NI is unique. 
*/ - lnet_net_lock(LNET_LOCK_EX); - rc = lnet_net_unique(net->net_id, &the_lnet.ln_nets); - lnet_net_unlock(LNET_LOCK_EX); - if (!rc) { - if (lnd_type == LOLND) { - lnet_net_free(net); - return 0; - } - - CERROR("Net %s is not unique\n", - libcfs_net2str(net->net_id)); - rc = -EEXIST; - goto failed0; - } - - mutex_lock(&the_lnet.ln_lnd_mutex); - lnd = lnet_find_lnd_by_type(lnd_type); - - if (!lnd) { - mutex_unlock(&the_lnet.ln_lnd_mutex); - rc = request_module("%s", libcfs_lnd2modname(lnd_type)); - mutex_lock(&the_lnet.ln_lnd_mutex); - - lnd = lnet_find_lnd_by_type(lnd_type); - if (!lnd) { - mutex_unlock(&the_lnet.ln_lnd_mutex); - CERROR("Can't load LND %s, module %s, rc=%d\n", - libcfs_lnd2str(lnd_type), - libcfs_lnd2modname(lnd_type), rc); - rc = -EINVAL; - goto failed0; - } - } - - lnet_net_lock(LNET_LOCK_EX); - lnd->lnd_refcount++; - lnet_net_unlock(LNET_LOCK_EX); - net->net_lnd = lnd; - mutex_unlock(&the_lnet.ln_lnd_mutex); - - ni = list_first_entry(&net->net_ni_list, struct lnet_ni, ni_netlist); - - rc = lnet_startup_lndni(ni, tun); - if (rc < 0) - return rc; - return 1; - -failed0: - lnet_net_free(net); - - return rc; -} - static int lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) { @@ -1427,6 +1358,72 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) return rc; } +static int +lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) +{ + struct lnet_ni *ni; + __u32 lnd_type; + struct lnet_lnd *lnd; + int rc; + + lnd_type = LNET_NETTYP(net->net_id); + + LASSERT(libcfs_isknown_lnd(lnd_type)); + + /* Make sure this new NI is unique. 
*/ + lnet_net_lock(LNET_LOCK_EX); + rc = lnet_net_unique(net->net_id, &the_lnet.ln_nets); + lnet_net_unlock(LNET_LOCK_EX); + if (!rc) { + if (lnd_type == LOLND) { + lnet_net_free(net); + return 0; + } + + CERROR("Net %s is not unique\n", + libcfs_net2str(net->net_id)); + rc = -EEXIST; + goto failed0; + } + + mutex_lock(&the_lnet.ln_lnd_mutex); + lnd = lnet_find_lnd_by_type(lnd_type); + + if (!lnd) { + mutex_unlock(&the_lnet.ln_lnd_mutex); + rc = request_module("%s", libcfs_lnd2modname(lnd_type)); + mutex_lock(&the_lnet.ln_lnd_mutex); + + lnd = lnet_find_lnd_by_type(lnd_type); + if (!lnd) { + mutex_unlock(&the_lnet.ln_lnd_mutex); + CERROR("Can't load LND %s, module %s, rc=%d\n", + libcfs_lnd2str(lnd_type), + libcfs_lnd2modname(lnd_type), rc); + rc = -EINVAL; + goto failed0; + } + } + + lnet_net_lock(LNET_LOCK_EX); + lnd->lnd_refcount++; + lnet_net_unlock(LNET_LOCK_EX); + net->net_lnd = lnd; + mutex_unlock(&the_lnet.ln_lnd_mutex); + + ni = list_first_entry(&net->net_ni_list, struct lnet_ni, ni_netlist); + + rc = lnet_startup_lndni(ni, tun); + if (rc < 0) + return rc; + return 1; + +failed0: + lnet_net_free(net); + + return rc; +} + static int lnet_startup_lndnets(struct list_head *netlist) { From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591357 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3FF70921 for ; Fri, 7 Sep 2018 00:53:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 315532AF87 for ; Fri, 7 Sep 2018 00:53:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 257A92B17D; Fri, 7 Sep 2018 00:53:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on 
pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CAB992AF87 for ; Fri, 7 Sep 2018 00:53:28 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 837834E31A0; Thu, 6 Sep 2018 17:53:28 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 172174E312F for ; Thu, 6 Sep 2018 17:53:26 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 155C1AEF1; Fri, 7 Sep 2018 00:53:25 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137179.8267.355535015389180735.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 14/34] lnet: rename lnet_find_net_locked to lnet_find_rnet_locked X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 2 +- drivers/staging/lustre/lnet/lnet/api-ni.c | 2 +- drivers/staging/lustre/lnet/lnet/lib-move.c | 2 +- drivers/staging/lustre/lnet/lnet/router.c | 4 ++-- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index e32dbb854d80..faa3f19dd844 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -430,7 +430,7 @@ int lnet_rtrpools_adjust(int tiny, int small, int large); int lnet_rtrpools_enable(void); void lnet_rtrpools_disable(void); void lnet_rtrpools_free(int keep_pools); -struct lnet_remotenet *lnet_find_net_locked(__u32 net); +struct lnet_remotenet *lnet_find_rnet_locked(__u32 net); int lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf); int lnet_dyn_del_ni(__u32 net); diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 09ea7e506128..c3c568e63342 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1909,7 +1909,7 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) net = list_entry(net_head.next, struct lnet_net, net_list); lnet_net_lock(LNET_LOCK_EX); - rnet = lnet_find_net_locked(net->net_id); + rnet = lnet_find_rnet_locked(net->net_id); lnet_net_unlock(LNET_LOCK_EX); /* * make sure that the net added doesn't invalidate the current diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c 
b/drivers/staging/lustre/lnet/lnet/lib-move.c index 02cd1a5a466f..00a89221c9b3 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1022,7 +1022,7 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, * If @rtr_nid is not LNET_NID_ANY, return the gateway with * rtr_nid nid, otherwise find the best gateway I can use */ - rnet = lnet_find_net_locked(LNET_NIDNET(target)); + rnet = lnet_find_rnet_locked(LNET_NIDNET(target)); if (!rnet) return NULL; diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 5493d13de6d9..1fce991fcb0e 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -220,7 +220,7 @@ lnet_rtr_decref_locked(struct lnet_peer *lp) } struct lnet_remotenet * -lnet_find_net_locked(__u32 net) +lnet_find_rnet_locked(__u32 net) { struct lnet_remotenet *rnet; struct list_head *rn_list; @@ -347,7 +347,7 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, LASSERT(!the_lnet.ln_shutdown); - rnet2 = lnet_find_net_locked(net); + rnet2 = lnet_find_rnet_locked(net); if (!rnet2) { /* new network */ list_add_tail(&rnet->lrn_list, lnet_net2rnethash(net)); From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591359 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2F580921 for ; Fri, 7 Sep 2018 00:53:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1F6BD2B06C for ; Fri, 7 Sep 2018 00:53:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 12C182B17D; Fri, 7 Sep 2018 00:53:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 
(2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.3 required=2.0 tests=BAYES_00,FUZZY_AMBIEN, MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=no version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 93A7E2B184 for ; Fri, 7 Sep 2018 00:53:34 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2B56A4E31DC; Thu, 6 Sep 2018 17:53:34 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 7DB884E318F for ; Thu, 6 Sep 2018 17:53:32 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id C5CF2AED7; Fri, 7 Sep 2018 00:53:31 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137183.8267.14166864803956204561.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 15/34] lnet: extend zombie handling to nets and nis X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP A zombie lnet_ni is now attached to the lnet_net rather than the global the_lnet. Zombie lnet_net structures are attached to the_lnet. For some reason, we don't drop the refcount on the lnd before shutting it down now. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- .../staging/lustre/include/linux/lnet/lib-types.h | 9 ++- drivers/staging/lustre/lnet/lnet/api-ni.c | 65 ++++++++++---------- drivers/staging/lustre/lnet/lnet/config.c | 3 + 3 files changed, 42 insertions(+), 35 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 22957d142cc0..1d372672e2de 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -284,6 +284,9 @@ struct lnet_net { struct lnet_lnd *net_lnd; /* list of NIs on this net */ struct list_head net_ni_list; + + /* dying LND instances */ + struct list_head net_ni_zombie; }; struct lnet_ni { @@ -653,11 +656,11 @@ struct lnet { /* LND instances */ struct list_head ln_nets; /* NIs bond on specific CPT(s) */ - struct list_head ln_nis_cpt; - /* dying LND instances */ - struct list_head ln_nis_zombie; + struct list_head ln_nis_cpt; /* the loopback NI */ struct lnet_ni *ln_loni; + /* network zombie list */ + struct list_head ln_net_zombie; /* remote networks with routes to them */ struct list_head *ln_remote_nets_hash; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index c3c568e63342..18d111cb826b 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -539,7 +539,6 @@
lnet_prepare(lnet_pid_t requested_pid) INIT_LIST_HEAD(&the_lnet.ln_test_peers); INIT_LIST_HEAD(&the_lnet.ln_nets); INIT_LIST_HEAD(&the_lnet.ln_nis_cpt); - INIT_LIST_HEAD(&the_lnet.ln_nis_zombie); INIT_LIST_HEAD(&the_lnet.ln_routers); INIT_LIST_HEAD(&the_lnet.ln_drop_rules); INIT_LIST_HEAD(&the_lnet.ln_delay_rules); @@ -618,7 +617,6 @@ lnet_unprepare(void) LASSERT(list_empty(&the_lnet.ln_test_peers)); LASSERT(list_empty(&the_lnet.ln_nets)); LASSERT(list_empty(&the_lnet.ln_nis_cpt)); - LASSERT(list_empty(&the_lnet.ln_nis_zombie)); lnet_portals_destroy(); @@ -1095,34 +1093,35 @@ lnet_ni_unlink_locked(struct lnet_ni *ni) /* move it to zombie list and nobody can find it anymore */ LASSERT(!list_empty(&ni->ni_netlist)); - list_move(&ni->ni_netlist, &the_lnet.ln_nis_zombie); + list_move(&ni->ni_netlist, &ni->ni_net->net_ni_zombie); lnet_ni_decref_locked(ni, 0); } static void -lnet_clear_zombies_nis_locked(void) +lnet_clear_zombies_nis_locked(struct lnet_net *net) { int i; int islo; struct lnet_ni *ni; + struct list_head *zombie_list = &net->net_ni_zombie; /* - * Now wait for the NI's I just nuked to show up on ln_zombie_nis - * and shut them down in guaranteed thread context + * Now wait for the NIs I just nuked to show up on the zombie + * list and shut them down in guaranteed thread context */ i = 2; - while (!list_empty(&the_lnet.ln_nis_zombie)) { + while (!list_empty(zombie_list)) { int *ref; int j; - ni = list_entry(the_lnet.ln_nis_zombie.next, + ni = list_entry(zombie_list->next, struct lnet_ni, ni_netlist); list_del_init(&ni->ni_netlist); cfs_percpt_for_each(ref, j, ni->ni_refs) { if (!*ref) continue; /* still busy, add it back to zombie list */ - list_add(&ni->ni_netlist, &the_lnet.ln_nis_zombie); + list_add(&ni->ni_netlist, zombie_list); break; } @@ -1138,18 +1137,13 @@ lnet_clear_zombies_nis_locked(void) continue; } - ni->ni_net->net_lnd->lnd_refcount--; lnet_net_unlock(LNET_LOCK_EX); islo = ni->ni_net->net_lnd->lnd_type == LOLND; LASSERT(!in_interrupt()); - 
ni->ni_net->net_lnd->lnd_shutdown(ni); + net->net_lnd->lnd_shutdown(ni); - /* - * can't deref lnd anymore now; it might have unregistered - * itself... - */ if (!islo) CDEBUG(D_LNI, "Removed LNI %s\n", libcfs_nid2str(ni->ni_nid)); @@ -1162,9 +1156,11 @@ lnet_clear_zombies_nis_locked(void) } static void -lnet_shutdown_lndnis(void) +lnet_shutdown_lndnet(struct lnet_net *net); + +static void +lnet_shutdown_lndnets(void) { - struct lnet_ni *ni; int i; struct lnet_net *net; @@ -1173,30 +1169,35 @@ lnet_shutdown_lndnis(void) /* All quiet on the API front */ LASSERT(!the_lnet.ln_shutdown); LASSERT(!the_lnet.ln_refcount); - LASSERT(list_empty(&the_lnet.ln_nis_zombie)); lnet_net_lock(LNET_LOCK_EX); the_lnet.ln_shutdown = 1; /* flag shutdown */ - /* Unlink NIs from the global table */ while (!list_empty(&the_lnet.ln_nets)) { + /* + * move the nets to the zombie list to avoid them being + * picked up for new work. LONET is also included in the + * Nets that will be moved to the zombie list + */ net = list_entry(the_lnet.ln_nets.next, struct lnet_net, net_list); - while (!list_empty(&net->net_ni_list)) { - ni = list_entry(net->net_ni_list.next, - struct lnet_ni, ni_netlist); - lnet_ni_unlink_locked(ni); - } + list_move(&net->net_list, &the_lnet.ln_net_zombie); } - /* Drop the cached loopback NI. */ + /* Drop the cached loopback Net. 
*/ if (the_lnet.ln_loni) { lnet_ni_decref_locked(the_lnet.ln_loni, 0); the_lnet.ln_loni = NULL; } - lnet_net_unlock(LNET_LOCK_EX); + /* iterate through the net zombie list and delete each net */ + while (!list_empty(&the_lnet.ln_net_zombie)) { + net = list_entry(the_lnet.ln_net_zombie.next, + struct lnet_net, net_list); + lnet_shutdown_lndnet(net); + } + /* * Clear lazy portals and drop delayed messages which hold refs * on their lnet_msg::msg_rxpeer @@ -1211,8 +1212,6 @@ lnet_shutdown_lndnis(void) lnet_peer_tables_cleanup(NULL); lnet_net_lock(LNET_LOCK_EX); - - lnet_clear_zombies_nis_locked(); the_lnet.ln_shutdown = 0; lnet_net_unlock(LNET_LOCK_EX); } @@ -1222,6 +1221,7 @@ static void lnet_shutdown_lndni(struct lnet_ni *ni) { int i; + struct lnet_net *net = ni->ni_net; lnet_net_lock(LNET_LOCK_EX); lnet_ni_unlink_locked(ni); @@ -1235,7 +1235,7 @@ lnet_shutdown_lndni(struct lnet_ni *ni) lnet_peer_tables_cleanup(ni); lnet_net_lock(LNET_LOCK_EX); - lnet_clear_zombies_nis_locked(); + lnet_clear_zombies_nis_locked(net); lnet_net_unlock(LNET_LOCK_EX); } @@ -1445,7 +1445,7 @@ lnet_startup_lndnets(struct list_head *netlist) return ni_count; failed: - lnet_shutdown_lndnis(); + lnet_shutdown_lndnets(); return rc; } @@ -1492,6 +1492,7 @@ int lnet_lib_init(void) the_lnet.ln_refcount = 0; LNetInvalidateEQHandle(&the_lnet.ln_rc_eqh); INIT_LIST_HEAD(&the_lnet.ln_lnds); + INIT_LIST_HEAD(&the_lnet.ln_net_zombie); INIT_LIST_HEAD(&the_lnet.ln_rcd_zombie); INIT_LIST_HEAD(&the_lnet.ln_rcd_deathrow); @@ -1656,7 +1657,7 @@ LNetNIInit(lnet_pid_t requested_pid) if (!the_lnet.ln_nis_from_mod_params) lnet_destroy_routes(); err_shutdown_lndnis: - lnet_shutdown_lndnis(); + lnet_shutdown_lndnets(); err_empty_list: lnet_unprepare(); LASSERT(rc < 0); @@ -1703,7 +1704,7 @@ LNetNIFini(void) lnet_acceptor_stop(); lnet_destroy_routes(); - lnet_shutdown_lndnis(); + lnet_shutdown_lndnets(); lnet_unprepare(); } diff --git a/drivers/staging/lustre/lnet/lnet/config.c 
b/drivers/staging/lustre/lnet/lnet/config.c index 380a3fb1caba..2588d67fea1b 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -279,6 +279,8 @@ lnet_net_free(struct lnet_net *net) struct list_head *tmp, *tmp2; struct lnet_ni *ni; + LASSERT(list_empty(&net->net_ni_zombie)); + /* delete any nis which have been started. */ list_for_each_safe(tmp, tmp2, &net->net_ni_list) { ni = list_entry(tmp, struct lnet_ni, ni_netlist); @@ -312,6 +314,7 @@ lnet_net_alloc(__u32 net_id, struct list_head *net_list) INIT_LIST_HEAD(&net->net_list); INIT_LIST_HEAD(&net->net_ni_list); + INIT_LIST_HEAD(&net->net_ni_zombie); net->net_id = net_id; From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591361 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4D83F921 for ; Fri, 7 Sep 2018 00:53:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3E7F42AF87 for ; Fri, 7 Sep 2018 00:53:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 330B12B17D; Fri, 7 Sep 2018 00:53:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id ED8EE2AF87 for ; Fri, 7 Sep 2018 00:53:46 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by 
pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B86D84E31C4; Thu, 6 Sep 2018 17:53:46 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E91954E3165 for ; Thu, 6 Sep 2018 17:53:44 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id BE8EEAE56; Fri, 7 Sep 2018 00:53:43 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137187.8267.1370078359892639459.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 16/34] lnet: lnet_shutdown_lndnets - remove some cleanup code. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP I don't know what this did, or why it is being removed. 
This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown --- drivers/staging/lustre/lnet/lnet/api-ni.c | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 18d111cb826b..2529a11c6c59 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1161,7 +1161,6 @@ lnet_shutdown_lndnet(struct lnet_net *net); static void lnet_shutdown_lndnets(void) { - int i; struct lnet_net *net; /* NB called holding the global mutex */ @@ -1198,19 +1197,6 @@ lnet_shutdown_lndnets(void) lnet_shutdown_lndnet(net); } - /* - * Clear lazy portals and drop delayed messages which hold refs - * on their lnet_msg::msg_rxpeer - */ - for (i = 0; i < the_lnet.ln_nportals; i++) - LNetClearLazyPortal(i); - - /* - * Clear the peer table and wait for all peers to go (they hold refs on - * their NIs) - */ - lnet_peer_tables_cleanup(NULL); - lnet_net_lock(LNET_LOCK_EX); the_lnet.ln_shutdown = 0; lnet_net_unlock(LNET_LOCK_EX); From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591363 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 362EF921 for ; Fri, 7 Sep 2018 00:53:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2945F2AF87 for ; Fri, 7 Sep 2018 00:53:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1D2FB2B17D; Fri, 7 Sep 2018 00:53:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.3 required=2.0 
tests=BAYES_00,FUZZY_AMBIEN, MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=no version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CAEB72AF87 for ; Fri, 7 Sep 2018 00:53:52 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 809AF4E31F6; Thu, 6 Sep 2018 17:53:52 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 3790B4E319F for ; Thu, 6 Sep 2018 17:53:51 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 54E1EAD72; Fri, 7 Sep 2018 00:53:50 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137192.8267.6087363236690952153.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 17/34] lnet: move lnet_shutdown_lndnets down to after first use X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 91 ++++++++++++++--------------- 1 file changed, 44 insertions(+), 47 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 2529a11c6c59..46c5ca71bc07 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1155,53 +1155,6 @@ lnet_clear_zombies_nis_locked(struct lnet_net *net) } } -static void -lnet_shutdown_lndnet(struct lnet_net *net); - -static void -lnet_shutdown_lndnets(void) -{ - struct lnet_net *net; - - /* NB called holding the global mutex */ - - /* All quiet on the API front */ - LASSERT(!the_lnet.ln_shutdown); - LASSERT(!the_lnet.ln_refcount); - - lnet_net_lock(LNET_LOCK_EX); - the_lnet.ln_shutdown = 1; /* flag shutdown */ - - while (!list_empty(&the_lnet.ln_nets)) { - /* - * move the nets to the zombie list to avoid them being - * picked up for new work. LONET is also included in the - * Nets that will be moved to the zombie list - */ - net = list_entry(the_lnet.ln_nets.next, - struct lnet_net, net_list); - list_move(&net->net_list, &the_lnet.ln_net_zombie); - } - - /* Drop the cached loopback Net. 
*/ - if (the_lnet.ln_loni) { - lnet_ni_decref_locked(the_lnet.ln_loni, 0); - the_lnet.ln_loni = NULL; - } - lnet_net_unlock(LNET_LOCK_EX); - - /* iterate through the net zombie list and delete each net */ - while (!list_empty(&the_lnet.ln_net_zombie)) { - net = list_entry(the_lnet.ln_net_zombie.next, - struct lnet_net, net_list); - lnet_shutdown_lndnet(net); - } - - lnet_net_lock(LNET_LOCK_EX); - the_lnet.ln_shutdown = 0; - lnet_net_unlock(LNET_LOCK_EX); -} - /* shutdown down the NI and release refcount */ static void lnet_shutdown_lndni(struct lnet_ni *ni) @@ -1253,6 +1206,50 @@ lnet_shutdown_lndnet(struct lnet_net *net) lnet_net_free(net); } +static void +lnet_shutdown_lndnets(void) +{ + struct lnet_net *net; + + /* NB called holding the global mutex */ + + /* All quiet on the API front */ + LASSERT(!the_lnet.ln_shutdown); + LASSERT(!the_lnet.ln_refcount); + + lnet_net_lock(LNET_LOCK_EX); + the_lnet.ln_shutdown = 1; /* flag shutdown */ + + while (!list_empty(&the_lnet.ln_nets)) { + /* + * move the nets to the zombie list to avoid them being + * picked up for new work. LONET is also included in the + * Nets that will be moved to the zombie list + */ + net = list_entry(the_lnet.ln_nets.next, + struct lnet_net, net_list); + list_move(&net->net_list, &the_lnet.ln_net_zombie); + } + + /* Drop the cached loopback Net. 
*/ + if (the_lnet.ln_loni) { + lnet_ni_decref_locked(the_lnet.ln_loni, 0); + the_lnet.ln_loni = NULL; + } + lnet_net_unlock(LNET_LOCK_EX); + + /* iterate through the net zombie list and delete each net */ + while (!list_empty(&the_lnet.ln_net_zombie)) { + net = list_entry(the_lnet.ln_net_zombie.next, + struct lnet_net, net_list); + lnet_shutdown_lndnet(net); + } + + lnet_net_lock(LNET_LOCK_EX); + the_lnet.ln_shutdown = 0; + lnet_net_unlock(LNET_LOCK_EX); +} + static int lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) { From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591365 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 372DD112B for ; Fri, 7 Sep 2018 00:54:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 26D6E2AF87 for ; Fri, 7 Sep 2018 00:54:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 197532B17D; Fri, 7 Sep 2018 00:54:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id A62B62AF87 for ; Fri, 7 Sep 2018 00:54:01 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 464154E31F3; Thu, 6 Sep 2018 17:54:01 -0700 (PDT) X-Original-To: 
lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 3B2D94E31CA for ; Thu, 6 Sep 2018 17:53:59 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 59766AD72; Fri, 7 Sep 2018 00:53:58 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137195.8267.16400748098054215181.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 18/34] lnet: add ni_state X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is barely used. 
This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 1 + .../staging/lustre/include/linux/lnet/lib-types.h | 16 ++++++++++++++++ drivers/staging/lustre/lnet/lnet/api-ni.c | 16 ++++++++++++++++ drivers/staging/lustre/lnet/lnet/config.c | 1 + 4 files changed, 34 insertions(+) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index faa3f19dd844..54a93235834c 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -400,6 +400,7 @@ int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni); struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt); struct lnet_ni *lnet_net2ni_locked(__u32 net, int cpt); struct lnet_ni *lnet_net2ni(__u32 net); +bool lnet_is_ni_healthy_locked(struct lnet_ni *ni); extern int portal_rotor; diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 1d372672e2de..6c34ecf22021 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -256,6 +256,19 @@ struct lnet_tx_queue { struct list_head tq_delayed; /* delayed TXs */ }; +enum lnet_ni_state { + /* set when NI block is allocated */ + LNET_NI_STATE_INIT = 0, + /* set when NI is started successfully */ + LNET_NI_STATE_ACTIVE, + /* set when LND notifies NI failed */ + LNET_NI_STATE_FAILED, + /* set when LND notifies NI degraded */ + LNET_NI_STATE_DEGRADED, + /* set when shutting down NI */ + LNET_NI_STATE_DELETING +}; + struct lnet_net { /* chain on the ln_nets */ struct list_head net_list; @@ -324,6 +337,9 @@ struct lnet_ni { /* my health status */ struct lnet_ni_status *ni_status; + /* NI FSM */ + enum
lnet_ni_state ni_state; + /* per NI LND tunables */ struct lnet_lnd_tunables ni_lnd_tunables; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 46c5ca71bc07..618fdf8141f0 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -780,6 +780,16 @@ lnet_islocalnet(__u32 net) return !!ni; } +bool +lnet_is_ni_healthy_locked(struct lnet_ni *ni) +{ + if (ni->ni_state == LNET_NI_STATE_ACTIVE || + ni->ni_state == LNET_NI_STATE_DEGRADED) + return true; + + return false; +} + struct lnet_ni * lnet_nid2ni_locked(lnet_nid_t nid, int cpt) { @@ -1117,6 +1127,9 @@ lnet_clear_zombies_nis_locked(struct lnet_net *net) ni = list_entry(zombie_list->next, struct lnet_ni, ni_netlist); list_del_init(&ni->ni_netlist); + /* the ni should be in deleting state. If it's not it's + * a bug */ + LASSERT(ni->ni_state == LNET_NI_STATE_DELETING); cfs_percpt_for_each(ref, j, ni->ni_refs) { if (!*ref) continue; @@ -1163,6 +1176,7 @@ lnet_shutdown_lndni(struct lnet_ni *ni) struct lnet_net *net = ni->ni_net; lnet_net_lock(LNET_LOCK_EX); + ni->ni_state = LNET_NI_STATE_DELETING; lnet_ni_unlink_locked(ni); lnet_net_unlock(LNET_LOCK_EX); @@ -1291,6 +1305,8 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) lnet_net_unlock(LNET_LOCK_EX); + ni->ni_state = LNET_NI_STATE_ACTIVE; + if (net->net_lnd->lnd_type == LOLND) { lnet_ni_addref(ni); LASSERT(!the_lnet.ln_loni); diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 2588d67fea1b..081812e19b13 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -393,6 +393,7 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) ni->ni_net_ns = NULL; ni->ni_last_alive = ktime_get_real_seconds(); + ni->ni_state = LNET_NI_STATE_INIT; rc = lnet_net_append_cpts(ni->ni_cpts, ni->ni_ncpts, net); if (rc != 0) goto failed; From 
patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591367 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 82365112B for ; Fri, 7 Sep 2018 00:54:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7187E2AF87 for ; Fri, 7 Sep 2018 00:54:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 661542B17D; Fri, 7 Sep 2018 00:54:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1DB7C2AF87 for ; Fri, 7 Sep 2018 00:54:08 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B1E864E320C; Thu, 6 Sep 2018 17:54:07 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 8754C4E31F6 for ; Thu, 6 Sep 2018 17:54:05 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id BC9A6AED7; Fri, 7 Sep 2018 00:54:04 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 
10:49:32 +1000 Message-ID: <153628137199.8267.13740900509571445302.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 19/34] lnet: simplify lnet_islocalnet() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Having lnet_get_net_locked() makes this (a little) simpler. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 618fdf8141f0..546d5101360f 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -764,20 +764,16 @@ lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni) EXPORT_SYMBOL(lnet_cpt_of_nid); int -lnet_islocalnet(__u32 net) +lnet_islocalnet(__u32 net_id) { - struct lnet_ni *ni; - int cpt; + struct lnet_net *net; + int cpt; cpt = lnet_net_lock_current(); - - ni = lnet_net2ni_locked(net, cpt); - if (ni) - lnet_ni_decref_locked(ni, cpt); - + net = lnet_get_net_locked(net_id); lnet_net_unlock(cpt); - return !!ni; + return !!net; } bool From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591371 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by 
pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B46EF921 for ; Fri, 7 Sep 2018 00:54:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A73DF2AF87 for ; Fri, 7 Sep 2018 00:54:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9BA8E2B17D; Fri, 7 Sep 2018 00:54:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 558B02AF87 for ; Fri, 7 Sep 2018 00:54:17 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 10B574E320C; Thu, 6 Sep 2018 17:54:17 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E3C774E3142 for ; Thu, 6 Sep 2018 17:54:15 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id CE9BAAE56; Fri, 7 Sep 2018 00:54:14 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137203.8267.14277020278461943610.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 20/34] lnet: 
discard ni_cpt_list X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This isn't used any more. The new comment is odd - there is no net_ni_cpt !! The ni_cptlist linkage is no longer used - should it go too? This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek
Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-types.h | 4 +--- drivers/staging/lustre/lnet/lnet/api-ni.c | 7 ------- 2 files changed, 1 insertion(+), 10 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 6c34ecf22021..dc15fa75a9d2 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -305,7 +305,7 @@ struct lnet_net { struct lnet_ni { /* chain on the lnet_net structure */ struct list_head ni_netlist; - /* chain on ln_nis_cpt */ + /* chain on net_ni_cpt */ struct list_head ni_cptlist; spinlock_t ni_lock; @@ -671,8 +671,6 @@ struct lnet { /* LND instances */ struct list_head ln_nets; - /* NIs bond on specific CPT(s) */ - struct list_head ln_nis_cpt; /* the loopback NI */ struct lnet_ni *ln_loni; /* network zombie list */ diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 546d5101360f..960f235df5e7 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -538,7 +538,6 @@ lnet_prepare(lnet_pid_t requested_pid) INIT_LIST_HEAD(&the_lnet.ln_test_peers); INIT_LIST_HEAD(&the_lnet.ln_nets); - INIT_LIST_HEAD(&the_lnet.ln_nis_cpt); INIT_LIST_HEAD(&the_lnet.ln_routers); INIT_LIST_HEAD(&the_lnet.ln_drop_rules); INIT_LIST_HEAD(&the_lnet.ln_delay_rules); @@ -616,7 +615,6 @@ lnet_unprepare(void) LASSERT(!the_lnet.ln_refcount); LASSERT(list_empty(&the_lnet.ln_test_peers)); LASSERT(list_empty(&the_lnet.ln_nets)); - LASSERT(list_empty(&the_lnet.ln_nis_cpt)); lnet_portals_destroy(); @@ -1294,11 +1292,6 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) /* refcount for ln_nis */ lnet_ni_addref_locked(ni, 0); list_add_tail(&ni->ni_net->net_list, &the_lnet.ln_nets); - if (ni->ni_cpts) { - lnet_ni_addref_locked(ni, 0); - list_add_tail(&ni->ni_cptlist, &the_lnet.ln_nis_cpt); - } 
- lnet_net_unlock(LNET_LOCK_EX); ni->ni_state = LNET_NI_STATE_ACTIVE; From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591373 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C3237112B for ; Fri, 7 Sep 2018 00:54:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B471F2AF87 for ; Fri, 7 Sep 2018 00:54:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A8C332B17D; Fri, 7 Sep 2018 00:54:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 525082AF87 for ; Fri, 7 Sep 2018 00:54:25 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 18EFD4E3238; Thu, 6 Sep 2018 17:54:25 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 416054E31BD for ; Thu, 6 Sep 2018 17:54:23 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 691E3AD72; Fri, 7 Sep 2018 00:54:22 +0000 (UTC) From: NeilBrown To: Oleg Drokin , 
Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137207.8267.8628833457984431545.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 21/34] lnet: add net_ni_added X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP When we allocate an ni, it is now added to the new net_ni_added list of unstarted interfaces. lnet_startup_lndnet() now starts all those added interfaces. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- .../staging/lustre/include/linux/lnet/lib-types.h | 3 ++ drivers/staging/lustre/lnet/lnet/api-ni.c | 39 +++++++++++++++++--- drivers/staging/lustre/lnet/lnet/config.c | 13 ++++++- 3 files changed, 48 insertions(+), 7 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index dc15fa75a9d2..1faa247a93b8 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -298,6 +298,9 @@ struct lnet_net { /* list of NIs on this net */ struct list_head net_ni_list; + /* list of NIs being added, but not started yet */ + struct list_head net_ni_added; + /* dying LND instances */ struct list_head net_ni_zombie; }; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 960f235df5e7..ce3dd0f32e12 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c 
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1350,12 +1350,15 @@ static int lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) { struct lnet_ni *ni; + struct list_head local_ni_list; + int rc; + int ni_count = 0; __u32 lnd_type; struct lnet_lnd *lnd; - int rc; lnd_type = LNET_NETTYP(net->net_id); + INIT_LIST_HEAD(&local_ni_list); LASSERT(libcfs_isknown_lnd(lnd_type)); /* Make sure this new NI is unique. */ @@ -1399,12 +1402,36 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) net->net_lnd = lnd; mutex_unlock(&the_lnet.ln_lnd_mutex); - ni = list_first_entry(&net->net_ni_list, struct lnet_ni, ni_netlist); + while (!list_empty(&net->net_ni_added)) { + ni = list_entry(net->net_ni_added.next, struct lnet_ni, + ni_netlist); + list_del_init(&ni->ni_netlist); - rc = lnet_startup_lndni(ni, tun); - if (rc < 0) - return rc; - return 1; + rc = lnet_startup_lndni(ni, tun); + + if (rc < 0) + goto failed1; + + list_add_tail(&ni->ni_netlist, &local_ni_list); + + ni_count++; + } + lnet_net_lock(LNET_LOCK_EX); + list_splice_tail(&local_ni_list, &net->net_ni_list); + lnet_net_unlock(LNET_LOCK_EX); + return ni_count; + +failed1: + /* + * shutdown the new NIs that are being started up + * free the NET being started + */ + while (!list_empty(&local_ni_list)) { + ni = list_entry(local_ni_list.next, struct lnet_ni, + ni_netlist); + + lnet_shutdown_lndni(ni); + } failed0: lnet_net_free(net); diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 081812e19b13..f886dcfc6d6e 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -281,6 +281,16 @@ lnet_net_free(struct lnet_net *net) LASSERT(list_empty(&net->net_ni_zombie)); + /* + * delete any nis that haven't been added yet. 
This could happen + * if there is a failure on net startup + */ + list_for_each_safe(tmp, tmp2, &net->net_ni_added) { + ni = list_entry(tmp, struct lnet_ni, ni_netlist); + list_del_init(&ni->ni_netlist); + lnet_ni_free(ni); + } + /* delete any nis which have been started. */ list_for_each_safe(tmp, tmp2, &net->net_ni_list) { ni = list_entry(tmp, struct lnet_ni, ni_netlist); @@ -314,6 +324,7 @@ lnet_net_alloc(__u32 net_id, struct list_head *net_list) INIT_LIST_HEAD(&net->net_list); INIT_LIST_HEAD(&net->net_ni_list); + INIT_LIST_HEAD(&net->net_ni_added); INIT_LIST_HEAD(&net->net_ni_zombie); net->net_id = net_id; @@ -397,7 +408,7 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) rc = lnet_net_append_cpts(ni->ni_cpts, ni->ni_ncpts, net); if (rc != 0) goto failed; - list_add_tail(&ni->ni_netlist, &net->net_ni_list); + list_add_tail(&ni->ni_netlist, &net->net_ni_added); return ni; failed: From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591375 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E04B7921 for ; Fri, 7 Sep 2018 00:54:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D32C92AF87 for ; Fri, 7 Sep 2018 00:54:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C78F82B17D; Fri, 7 Sep 2018 00:54:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher 
DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4E64E2AF87 for ; Fri, 7 Sep 2018 00:54:32 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 121824E324C; Thu, 6 Sep 2018 17:54:32 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id A84054E320C for ; Thu, 6 Sep 2018 17:54:30 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id A841DAD72; Fri, 7 Sep 2018 00:54:29 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137211.8267.6066882916704220570.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 22/34] lnet: don't take reference in lnet_XX2ni_locked() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP lnet_net2ni_locked() and lnet_nid2ni_locked() no longer take a reference - as the lock is held, a ref isn't always needed. Instead, introduce lnet_nid2ni_addref() which does take the reference (but doesn't need the lock). Various places which called lnet_net2ni_locked() or lnet_nid2ni_locked() no longer need to drop the ref afterwards. 
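As a rough userspace illustration of the two lookup flavours described above (this is not LNet code: the demo_* names, the fixed table and the plain flag standing in for lnet_net_lock are all made up for this sketch, and it is single-threaded):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lnet_ni. */
struct demo_ni {
	unsigned long nid;
	int refcount;
};

static struct demo_ni demo_nis[] = { { 10, 1 }, { 20, 1 } };

/* Hypothetical stand-in for lnet_net_lock: a flag, not a real lock. */
static int demo_lock_held;

static void demo_lock(void)
{
	assert(!demo_lock_held);
	demo_lock_held = 1;
}

static void demo_unlock(void)
{
	assert(demo_lock_held);
	demo_lock_held = 0;
}

/* Caller holds the lock; the result is only safe to use while it is held. */
static struct demo_ni *demo_nid2ni_locked(unsigned long nid)
{
	size_t i;

	assert(demo_lock_held);
	for (i = 0; i < sizeof(demo_nis) / sizeof(demo_nis[0]); i++)
		if (demo_nis[i].nid == nid)
			return &demo_nis[i];
	return NULL;
}

/* Takes the lock itself and returns a counted reference, or NULL. */
static struct demo_ni *demo_nid2ni_addref(unsigned long nid)
{
	struct demo_ni *ni;

	demo_lock();
	ni = demo_nid2ni_locked(nid);
	if (ni)
		ni->refcount++;
	demo_unlock();
	return ni;
}

/* Drops a reference obtained from demo_nid2ni_addref(). */
static void demo_ni_decref(struct demo_ni *ni)
{
	demo_lock();
	assert(ni->refcount > 0);
	ni->refcount--;
	demo_unlock();
}

/* Existence test only: the pointer is not used after unlock, so no ref. */
static int demo_islocalnid(unsigned long nid)
{
	struct demo_ni *ni;

	demo_lock();
	ni = demo_nid2ni_locked(nid);
	demo_unlock();
	return ni != NULL;
}
```

A caller that only tests for existence, as lnet_islocalnid() does, can use the _locked flavour and has nothing to drop afterwards; a caller that keeps the pointer past the unlock, as lnet_accept() does, wants the _addref flavour and is expected to drop the reference when done.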
This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 1 + .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 + drivers/staging/lustre/lnet/lnet/acceptor.c | 2 + drivers/staging/lustre/lnet/lnet/api-ni.c | 27 +++++++++++++------- drivers/staging/lustre/lnet/lnet/lib-move.c | 17 +------------ 5 files changed, 21 insertions(+), 28 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 54a93235834c..6401d9a37b23 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -398,6 +398,7 @@ extern int avoid_asym_router_failure; int lnet_cpt_of_nid_locked(lnet_nid_t nid, struct lnet_ni *ni); int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni); struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt); +struct lnet_ni *lnet_nid2ni_addref(lnet_nid_t nid); struct lnet_ni *lnet_net2ni_locked(__u32 net, int cpt); struct lnet_ni *lnet_net2ni(__u32 net); bool lnet_is_ni_healthy_locked(struct lnet_ni *ni); diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c index e64c14914924..af8f863b6a68 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -2294,7 +2294,7 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob) } nid = reqmsg->ibm_srcnid; - ni = lnet_net2ni(LNET_NIDNET(reqmsg->ibm_dstnid)); + ni = lnet_nid2ni_addref(reqmsg->ibm_dstnid); if (ni) { net = (struct kib_net *)ni->ni_data; diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c index 88b90c1fdbaf..25205f686801 100644 --- a/drivers/staging/lustre/lnet/lnet/acceptor.c +++ 
b/drivers/staging/lustre/lnet/lnet/acceptor.c @@ -296,7 +296,7 @@ lnet_accept(struct socket *sock, __u32 magic) if (flip) __swab64s(&cr.acr_nid); - ni = lnet_net2ni(LNET_NIDNET(cr.acr_nid)); + ni = lnet_nid2ni_addref(cr.acr_nid); if (!ni || /* no matching net */ ni->ni_nid != cr.acr_nid) { /* right NET, wrong NID! */ if (ni) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index ce3dd0f32e12..42e775e2a669 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -655,7 +655,6 @@ lnet_net2ni_locked(__u32 net_id, int cpt) if (net->net_id == net_id) { ni = list_entry(net->net_ni_list.next, struct lnet_ni, ni_netlist); - lnet_ni_addref_locked(ni, cpt); return ni; } } @@ -794,16 +793,29 @@ lnet_nid2ni_locked(lnet_nid_t nid, int cpt) list_for_each_entry(net, &the_lnet.ln_nets, net_list) { list_for_each_entry(ni, &net->net_ni_list, ni_netlist) { - if (ni->ni_nid == nid) { - lnet_ni_addref_locked(ni, cpt); + if (ni->ni_nid == nid) return ni; - } } } return NULL; } +struct lnet_ni * +lnet_nid2ni_addref(lnet_nid_t nid) +{ + struct lnet_ni *ni; + + lnet_net_lock(0); + ni = lnet_nid2ni_locked(nid, 0); + if (ni) + lnet_ni_addref_locked(ni, 0); + lnet_net_unlock(0); + + return ni; +} +EXPORT_SYMBOL(lnet_nid2ni_addref); + int lnet_islocalnid(lnet_nid_t nid) { @@ -812,8 +824,6 @@ lnet_islocalnid(lnet_nid_t nid) cpt = lnet_net_lock_current(); ni = lnet_nid2ni_locked(nid, cpt); - if (ni) - lnet_ni_decref_locked(ni, cpt); lnet_net_unlock(cpt); return !!ni; @@ -1412,6 +1422,7 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) if (rc < 0) goto failed1; + lnet_ni_addref(ni); list_add_tail(&ni->ni_netlist, &local_ni_list); ni_count++; @@ -2032,9 +2043,6 @@ lnet_dyn_del_ni(__u32 net) goto failed; } - /* decrement the reference counter taken by lnet_net2ni() */ - lnet_ni_decref_locked(ni, 0); - lnet_shutdown_lndni(ni); if (!lnet_count_acceptor_nets()) @@ -2264,7 
+2272,6 @@ LNetCtl(unsigned int cmd, void *arg) else rc = ni->ni_net->net_lnd->lnd_ctl(ni, cmd, arg); - lnet_ni_decref(ni); return rc; } /* not reached */ diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 00a89221c9b3..60f34c4b85d3 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1127,11 +1127,7 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) if (!src_ni) { src_ni = local_ni; src_nid = src_ni->ni_nid; - } else if (src_ni == local_ni) { - lnet_ni_decref_locked(local_ni, cpt); - } else { - lnet_ni_decref_locked(local_ni, cpt); - lnet_ni_decref_locked(src_ni, cpt); + } else if (src_ni != local_ni) { lnet_net_unlock(cpt); LCONSOLE_WARN("No route to %s via from %s\n", libcfs_nid2str(dst_nid), @@ -1149,16 +1145,10 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) /* No send credit hassles with LOLND */ lnet_net_unlock(cpt); lnet_ni_send(src_ni, msg); - - lnet_net_lock(cpt); - lnet_ni_decref_locked(src_ni, cpt); - lnet_net_unlock(cpt); return 0; } rc = lnet_nid2peer_locked(&lp, dst_nid, cpt); - /* lp has ref on src_ni; lose mine */ - lnet_ni_decref_locked(src_ni, cpt); if (rc) { lnet_net_unlock(cpt); LCONSOLE_WARN("Error %d finding peer %s\n", rc, @@ -1173,8 +1163,6 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) src_ni->ni_net : NULL, dst_nid, rtr_nid); if (!lp) { - if (src_ni) - lnet_ni_decref_locked(src_ni, cpt); lnet_net_unlock(cpt); LCONSOLE_WARN("No route to %s via %s (all routers down)\n", @@ -1192,8 +1180,6 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) if (rtr_nid != lp->lp_nid) { cpt2 = lp->lp_cpt; if (cpt2 != cpt) { - if (src_ni) - lnet_ni_decref_locked(src_ni, cpt); lnet_net_unlock(cpt); rtr_nid = lp->lp_nid; @@ -1212,7 +1198,6 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) src_nid = src_ni->ni_nid; } else { 
LASSERT(src_ni->ni_net == lp->lp_net); - lnet_ni_decref_locked(src_ni, cpt); } lnet_peer_addref_locked(lp); From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591377 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 84C3D112B for ; Fri, 7 Sep 2018 00:54:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 77E5E2AF87 for ; Fri, 7 Sep 2018 00:54:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6C0AA2B17D; Fri, 7 Sep 2018 00:54:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2BC9A2AF87 for ; Fri, 7 Sep 2018 00:54:39 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id BD6944E3254; Thu, 6 Sep 2018 17:54:38 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id EC74F4E3215 for ; Thu, 6 Sep 2018 17:54:36 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 22747AE56; Fri, 7 Sep 2018 00:54:36 +0000 
(UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137216.8267.15666994772553520296.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 23/34] lnet: don't need lock to test ln_shutdown. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP lnet_send() returns -ESHUTDOWN if ln_shutdown is already set. The lock is always taken to set ln_shutdown, but apparently we don't need to hold the lock for this test. I guess if it is set immediately after the test, and before we take the lock, then... can anything bad happen? This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/lib-move.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 60f34c4b85d3..46e593fbb44f 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1099,12 +1099,9 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) cpt = lnet_cpt_of_nid(rtr_nid == LNET_NID_ANY ?
dst_nid : rtr_nid, local_ni); again: - lnet_net_lock(cpt); - - if (the_lnet.ln_shutdown) { - lnet_net_unlock(cpt); + if (the_lnet.ln_shutdown) return -ESHUTDOWN; - } + lnet_net_lock(cpt); if (src_nid == LNET_NID_ANY) { src_ni = NULL; From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591379 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 13AEE112B for ; Fri, 7 Sep 2018 00:54:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0686A2AF87 for ; Fri, 7 Sep 2018 00:54:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EF0FC2B17D; Fri, 7 Sep 2018 00:54:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id B15BF2AF87 for ; Fri, 7 Sep 2018 00:54:45 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 6D0624E3259; Thu, 6 Sep 2018 17:54:45 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 994B14E31F6 for ; Thu, 6 Sep 2018 17:54:43 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from 
relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 8F948AD72; Fri, 7 Sep 2018 00:54:42 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137220.8267.9479780300443400770.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 24/34] lnet: don't take lock over lnet_net_unique() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Holding ln_api_mutex is enough to keep the list stable. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 42e775e2a669..2b5c25a1dc7c 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1372,9 +1372,7 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) LASSERT(libcfs_isknown_lnd(lnd_type)); /* Make sure this new NI is unique.
*/ - lnet_net_lock(LNET_LOCK_EX); rc = lnet_net_unique(net->net_id, &the_lnet.ln_nets); - lnet_net_unlock(LNET_LOCK_EX); if (!rc) { if (lnd_type == LOLND) { lnet_net_free(net); From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591381 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9832B921 for ; Fri, 7 Sep 2018 00:54:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8A34F2AF87 for ; Fri, 7 Sep 2018 00:54:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7E6D72B17D; Fri, 7 Sep 2018 00:54:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2BF282AF87 for ; Fri, 7 Sep 2018 00:54:52 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id EB3AB4E3288; Thu, 6 Sep 2018 17:54:51 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 8624D4E3259 for ; Thu, 6 Sep 2018 17:54:50 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de 
(Postfix) with ESMTP id 8AE40AC90; Fri, 7 Sep 2018 00:54:49 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137225.8267.589541269937133692.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 25/34] lnet: swap 'then' and 'else' branches in lnet_startup_lndnet X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This swap makes the diff for the next patch more readable. We also stop storing the return value from lnet_net_unique() as it is never used. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 55 +++++++++++++++-------------- 1 file changed, 28 insertions(+), 27 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 2b5c25a1dc7c..ab4d093c04da 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1372,8 +1372,34 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) LASSERT(libcfs_isknown_lnd(lnd_type)); /* Make sure this new NI is unique. 
*/ - rc = lnet_net_unique(net->net_id, &the_lnet.ln_nets); - if (!rc) { + if (lnet_net_unique(net->net_id, &the_lnet.ln_nets)) { + mutex_lock(&the_lnet.ln_lnd_mutex); + lnd = lnet_find_lnd_by_type(lnd_type); + + if (lnd == NULL) { + mutex_unlock(&the_lnet.ln_lnd_mutex); + rc = request_module("%s", libcfs_lnd2modname(lnd_type)); + mutex_lock(&the_lnet.ln_lnd_mutex); + + lnd = lnet_find_lnd_by_type(lnd_type); + if (lnd == NULL) { + mutex_unlock(&the_lnet.ln_lnd_mutex); + CERROR("Can't load LND %s, module %s, rc=%d\n", + libcfs_lnd2str(lnd_type), + libcfs_lnd2modname(lnd_type), rc); + rc = -EINVAL; + goto failed0; + } + } + + lnet_net_lock(LNET_LOCK_EX); + lnd->lnd_refcount++; + lnet_net_unlock(LNET_LOCK_EX); + + net->net_lnd = lnd; + + mutex_unlock(&the_lnet.ln_lnd_mutex); + } else { if (lnd_type == LOLND) { lnet_net_free(net); return 0; @@ -1385,31 +1411,6 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) goto failed0; } - mutex_lock(&the_lnet.ln_lnd_mutex); - lnd = lnet_find_lnd_by_type(lnd_type); - - if (!lnd) { - mutex_unlock(&the_lnet.ln_lnd_mutex); - rc = request_module("%s", libcfs_lnd2modname(lnd_type)); - mutex_lock(&the_lnet.ln_lnd_mutex); - - lnd = lnet_find_lnd_by_type(lnd_type); - if (!lnd) { - mutex_unlock(&the_lnet.ln_lnd_mutex); - CERROR("Can't load LND %s, module %s, rc=%d\n", - libcfs_lnd2str(lnd_type), - libcfs_lnd2modname(lnd_type), rc); - rc = -EINVAL; - goto failed0; - } - } - - lnet_net_lock(LNET_LOCK_EX); - lnd->lnd_refcount++; - lnet_net_unlock(LNET_LOCK_EX); - net->net_lnd = lnd; - mutex_unlock(&the_lnet.ln_lnd_mutex); - while (!list_empty(&net->net_ni_added)) { ni = list_entry(net->net_ni_added.next, struct lnet_ni, ni_netlist); From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591383 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org 
[172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BA9D1921 for ; Fri, 7 Sep 2018 00:55:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AC1FF2AF87 for ; Fri, 7 Sep 2018 00:55:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A0C7B2B17D; Fri, 7 Sep 2018 00:55:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 574FE2AF87 for ; Fri, 7 Sep 2018 00:55:28 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 3658121F961; Thu, 6 Sep 2018 17:55:02 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 23AF0201325 for ; Thu, 6 Sep 2018 17:55:00 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 23D81AC90; Fri, 7 Sep 2018 00:54:59 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137230.8267.13002746471298616108.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: 
[lustre-devel] [PATCH 26/34] lnet: only validate lnd_type when net_id is unique. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP If it isn't unique, we won't add it, so no need to validate. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index ab4d093c04da..0dfd3004f735 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1366,13 +1366,14 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) __u32 lnd_type; struct lnet_lnd *lnd; - lnd_type = LNET_NETTYP(net->net_id); - INIT_LIST_HEAD(&local_ni_list); - LASSERT(libcfs_isknown_lnd(lnd_type)); /* Make sure this new NI is unique.
*/ if (lnet_net_unique(net->net_id, &the_lnet.ln_nets)) { + lnd_type = LNET_NETTYP(net->net_id); + + LASSERT(libcfs_isknown_lnd(lnd_type)); + mutex_lock(&the_lnet.ln_lnd_mutex); lnd = lnet_find_lnd_by_type(lnd_type); From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591385 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E8C36112B for ; Fri, 7 Sep 2018 00:55:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DA7D32AF87 for ; Fri, 7 Sep 2018 00:55:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CF2CF2B17D; Fri, 7 Sep 2018 00:55:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5D3EA2AF87 for ; Fri, 7 Sep 2018 00:55:34 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 5BAAD4E20C7; Thu, 6 Sep 2018 17:55:09 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 35C0C4E1EDD for ; Thu, 6 Sep 2018 17:55:07 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de 
(unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 3F2C2AD72; Fri, 7 Sep 2018 00:55:06 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137234.8267.14872362382875902424.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 27/34] lnet: make it possible to add a new interface to a network X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP lnet_startup_lndnet() is enhanced to cope if the net already exists. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 3 + drivers/staging/lustre/lnet/lnet/api-ni.c | 69 +++++++++++++++----- drivers/staging/lustre/lnet/lnet/config.c | 12 ++- 3 files changed, 61 insertions(+), 23 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 6401d9a37b23..905213fc16c7 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -630,7 +630,8 @@ void lnet_swap_pinginfo(struct lnet_ping_info *info); int lnet_parse_ip2nets(char **networksp, char *ip2nets); int lnet_parse_routes(char *route_str, int *im_a_router); int lnet_parse_networks(struct list_head *nilist, char *networks); -bool lnet_net_unique(__u32 net, struct list_head *nilist); +bool lnet_net_unique(__u32 
net_id, struct list_head *nilist, + struct lnet_net **net); int lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt); struct lnet_peer *lnet_find_peer_locked(struct lnet_peer_table *ptable, diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 0dfd3004f735..042ab0d9e318 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1298,14 +1298,9 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) goto failed0; } - lnet_net_lock(LNET_LOCK_EX); - /* refcount for ln_nis */ - lnet_ni_addref_locked(ni, 0); - list_add_tail(&ni->ni_net->net_list, &the_lnet.ln_nets); - lnet_net_unlock(LNET_LOCK_EX); - ni->ni_state = LNET_NI_STATE_ACTIVE; + /* We keep a reference on the loopback net through the loopback NI */ if (net->net_lnd->lnd_type == LOLND) { lnet_ni_addref(ni); LASSERT(!the_lnet.ln_loni); @@ -1360,6 +1355,7 @@ static int lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) { struct lnet_ni *ni; + struct lnet_net *net_l = NULL; struct list_head local_ni_list; int rc; int ni_count = 0; @@ -1368,8 +1364,14 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) INIT_LIST_HEAD(&local_ni_list); - /* Make sure this new NI is unique. */ - if (lnet_net_unique(net->net_id, &the_lnet.ln_nets)) { + /* + * make sure that this net is unique. If it isn't then + * we are adding interfaces to an already existing network, and + * 'net' is just a convenient way to pass in the list. + * if it is unique we need to find the LND and load it if + * necessary. 
+ */ + if (lnet_net_unique(net->net_id, &the_lnet.ln_nets, &net_l)) { lnd_type = LNET_NETTYP(net->net_id); LASSERT(libcfs_isknown_lnd(lnd_type)); @@ -1400,23 +1402,41 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) net->net_lnd = lnd; mutex_unlock(&the_lnet.ln_lnd_mutex); - } else { - if (lnd_type == LOLND) { - lnet_net_free(net); - return 0; - } - CERROR("Net %s is not unique\n", - libcfs_net2str(net->net_id)); - rc = -EEXIST; - goto failed0; + net_l = net; } + /* + * net_l: if the network being added is unique then net_l + * will point to that network + * if the network being added is not unique then + * net_l points to the existing network. + * + * When we enter the loop below, we'll pick NIs off the + * network being added and start them up, then add them to + * a local ni list. Once we've successfully started all + * the NIs then we join the local NI list (of started up + * networks) with the net_l->net_ni_list, which should + * point to the correct network to add the new ni list to + * + * If any of the new NIs fail to start up, then we want to + * iterate through the local ni list, which should include + * any NIs which were successfully started up, and shut + * them down. + * + * After that we want to delete the network being added, + * to avoid a memory leak.
+ */ + while (!list_empty(&net->net_ni_added)) { ni = list_entry(net->net_ni_added.next, struct lnet_ni, ni_netlist); list_del_init(&ni->ni_netlist); + /* adjust the pointer the parent network, just in case it + * the net is a duplicate */ + ni->ni_net = net_l; + rc = lnet_startup_lndni(ni, tun); if (rc < 0) @@ -1427,9 +1447,22 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) ni_count++; } + lnet_net_lock(LNET_LOCK_EX); - list_splice_tail(&local_ni_list, &net->net_ni_list); + list_splice_tail(&local_ni_list, &net_l->net_ni_list); lnet_net_unlock(LNET_LOCK_EX); + + /* if the network is not unique then we don't want to keep + * it around after we're done. Free it. Otherwise add that + * net to the global the_lnet.ln_nets */ + if (net_l != net && net_l != NULL) { + lnet_net_free(net); + } else { + lnet_net_lock(LNET_LOCK_EX); + list_add_tail(&net->net_list, &the_lnet.ln_nets); + lnet_net_unlock(LNET_LOCK_EX); + } + return ni_count; failed1: diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index f886dcfc6d6e..fcae50676422 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -79,13 +79,17 @@ lnet_issep(char c) } bool -lnet_net_unique(__u32 net, struct list_head *netlist) +lnet_net_unique(__u32 net_id, struct list_head *netlist, + struct lnet_net **net) { - struct lnet_net *net_l; + struct lnet_net *net_l; list_for_each_entry(net_l, netlist, net_list) { - if (net_l->net_id == net) + if (net_l->net_id == net_id) { + if (net != NULL) + *net = net_l; return false; + } } return true; @@ -309,7 +313,7 @@ lnet_net_alloc(__u32 net_id, struct list_head *net_list) { struct lnet_net *net; - if (!lnet_net_unique(net_id, net_list)) { + if (!lnet_net_unique(net_id, net_list, NULL)) { CERROR("Duplicate net %s. 
Ignore\n", libcfs_net2str(net_id)); return NULL; From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591387 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B38C112B for ; Fri, 7 Sep 2018 00:55:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5B5062AF87 for ; Fri, 7 Sep 2018 00:55:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4FA252B17D; Fri, 7 Sep 2018 00:55:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0151E2AF87 for ; Fri, 7 Sep 2018 00:55:39 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id CAD1A4E2946; Thu, 6 Sep 2018 17:55:15 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9CE084E2319 for ; Thu, 6 Sep 2018 17:55:13 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id ACE0DAC90; Fri, 7 Sep 2018 00:55:12 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James 
Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137237.8267.13304938702171788855.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 28/34] lnet: add checks to ensure network interface names are unique. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 1 + drivers/staging/lustre/lnet/lnet/api-ni.c | 8 ++++++ drivers/staging/lustre/lnet/lnet/config.c | 25 ++++++++++++++++++++ 3 files changed, 34 insertions(+) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 905213fc16c7..ef551b571935 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -632,6 +632,7 @@ int lnet_parse_routes(char *route_str, int *im_a_router); int lnet_parse_networks(struct list_head *nilist, char *networks); bool lnet_net_unique(__u32 net_id, struct list_head *nilist, struct lnet_net **net); +bool lnet_ni_unique_net(struct list_head *nilist, char *iface); int lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt); struct lnet_peer *lnet_find_peer_locked(struct lnet_peer_table *ptable, diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 042ab0d9e318..3f6f5ead8a03 100644 --- 
a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1433,6 +1433,14 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) ni_netlist); list_del_init(&ni->ni_netlist); + /* make sure that the NI we're about to start + * up is actually unique. If it's not, fail. */ + if (!lnet_ni_unique_net(&net_l->net_ni_list, + ni->ni_interfaces[0])) { + rc = -EINVAL; + goto failed1; + } + /* adjust the pointer the parent network, just in case it * the net is a duplicate */ ni->ni_net = net_l; diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index fcae50676422..11d6dbc80507 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -95,6 +95,25 @@ lnet_net_unique(__u32 net_id, struct list_head *netlist, return true; } +/* check that the NI is unique within the list of NIs already added to + * a network */ +bool +lnet_ni_unique_net(struct list_head *nilist, char *iface) +{ + struct list_head *tmp; + struct lnet_ni *ni; + + list_for_each(tmp, nilist) { + ni = list_entry(tmp, struct lnet_ni, ni_netlist); + + if (ni->ni_interfaces[0] != NULL && + strncmp(ni->ni_interfaces[0], iface, strlen(iface)) == 0) + return false; + } + + return true; +} + static bool in_array(__u32 *array, __u32 size, __u32 value) { @@ -352,6 +371,12 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) int rc; int i; + if (iface != NULL) + /* make sure that this NI is unique in the net it's + * being added to */ + if (!lnet_ni_unique_net(&net->net_ni_added, iface)) + return NULL; + ni = kzalloc(sizeof(*ni), GFP_KERNEL); if (ni == NULL) { CERROR("Out of memory creating network interface %s%s\n", From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591389 Return-Path: Received: from
mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6BF28921 for ; Fri, 7 Sep 2018 00:55:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5E0142AF87 for ; Fri, 7 Sep 2018 00:55:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5263E2B17D; Fri, 7 Sep 2018 00:55:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1241D2AF87 for ; Fri, 7 Sep 2018 00:55:44 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id D44404E2985; Thu, 6 Sep 2018 17:55:23 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4B89B4E1DDF for ; Thu, 6 Sep 2018 17:55:22 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 6BEC5AD72; Fri, 7 Sep 2018 00:55:21 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137241.8267.17639366442733480713.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: 
StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 29/34] lnet: track tunables in lnet_startup_lndnet() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Not really sure what this is yet. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 3f6f5ead8a03..f4efb48c4cf3 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1361,6 +1361,12 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) int ni_count = 0; __u32 lnd_type; struct lnet_lnd *lnd; + int peer_timeout = + net->net_tunables.lct_peer_timeout; + int maxtxcredits = + net->net_tunables.lct_max_tx_credits; + int peerrtrcredits = + net->net_tunables.lct_peer_rtr_credits; INIT_LIST_HEAD(&local_ni_list); @@ -1447,6 +1453,9 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) rc = lnet_startup_lndni(ni, tun); + LASSERT(ni->ni_net->net_tunables.lct_peer_timeout <= 0 || + ni->ni_net->net_lnd->lnd_query != NULL); + if (rc < 0) goto failed1; @@ -1464,8 +1473,23 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) * it around after we're done. Free it. Otherwise add that * net to the global the_lnet.ln_nets */ if (net_l != net && net_l != NULL) { + /* + * TODO - note. 
currently the tunables cannot be updated + * once added + */ lnet_net_free(net); } else { + /* + * restore tunables after they have been overwritten by the + * lnd + */ + if (peer_timeout != -1) + net->net_tunables.lct_peer_timeout = peer_timeout; + if (maxtxcredits != -1) + net->net_tunables.lct_max_tx_credits = maxtxcredits; + if (peerrtrcredits != -1) + net->net_tunables.lct_peer_rtr_credits = peerrtrcredits; + lnet_net_lock(LNET_LOCK_EX); list_add_tail(&net->net_list, &the_lnet.ln_nets); lnet_net_unlock(LNET_LOCK_EX); From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591391 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7CDD1112B for ; Fri, 7 Sep 2018 00:55:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6C5F82AF87 for ; Fri, 7 Sep 2018 00:55:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5EE8C2B17D; Fri, 7 Sep 2018 00:55:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1C9922AF87 for ; Fri, 7 Sep 2018 00:55:49 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 801504E3257; Thu, 6 Sep 2018 17:55:30 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To:
lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id D1F4F4E2DBD for ; Thu, 6 Sep 2018 17:55:28 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id F3E24AC90; Fri, 7 Sep 2018 00:55:27 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137245.8267.15918400233109764157.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 30/34] lnet: fix typo X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP to -> too This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index f4efb48c4cf3..cf0ffb8ac84b 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1868,7 +1868,7 @@ lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_data *config) if (config->cfg_hdr.ioc_len > min_size) tunable_size = config->cfg_hdr.ioc_len - min_size; - /* Don't copy to much data to user space */ + /* Don't copy too much data to user space */ min_size = min(tunable_size, 
sizeof(ni->ni_lnd_tunables)); lnd_cfg = (struct lnet_ioctl_config_lnd_tunables *)net_config->cfg_bulk; From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591393 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AD7E9112B for ; Fri, 7 Sep 2018 00:55:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9EEF22AF87 for ; Fri, 7 Sep 2018 00:55:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 926582B17D; Fri, 7 Sep 2018 00:55:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 7BDCD2AF87 for ; Fri, 7 Sep 2018 00:55:53 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4D9FD4E32BC; Thu, 6 Sep 2018 17:55:37 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 3E9144E1E13 for ; Thu, 6 Sep 2018 17:55:35 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 71087AD72; Fri, 7 Sep 2018 00:55:34 +0000 (UTC) 
From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137248.8267.6726065772840936203.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 31/34] lnet: lnet_dyn_add_ni: fix ping_info count X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Use the correct count of interfaces when calling lnet_ping_info_setup() in lnet_dyn_add_ni() Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index cf0ffb8ac84b..2ce0a7212dc2 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -871,6 +871,18 @@ lnet_ping_info_create(int num_ni) return ping_info; } +static inline int +lnet_get_net_ni_count_locked(struct lnet_net *net) +{ + struct lnet_ni *ni; + int count = 0; + + list_for_each_entry(ni, &net->net_ni_list, ni_netlist) + count++; + + return count; +} + static inline int lnet_get_ni_count(void) { @@ -1977,6 +1989,7 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) struct list_head net_head; struct lnet_remotenet *rnet; int rc; + int net_ni_count; int num_acceptor_nets; __u32 net_type; struct lnet_ioctl_config_lnd_tunables *lnd_tunables = NULL; @@ -2014,7 +2027,19 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct 
lnet_ioctl_config_data *conf) goto failed0; } - rc = lnet_ping_info_setup(&pinfo, &md_handle, 1 + lnet_get_ni_count(), + /* + * make sure you calculate the correct number of slots in the ping + * info. Since the ping info is a flattened list of all the NIs, + * we should allocate enough slots to accommodate the number of NIs + * which will be added. + * + * We can use lnet_get_net_ni_count_locked() since the net is not + * on a public list yet, so locking is not a problem + */ + net_ni_count = lnet_get_net_ni_count_locked(net); + + rc = lnet_ping_info_setup(&pinfo, &md_handle, + net_ni_count + lnet_get_ni_count(), false); if (rc) goto failed0; From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591395 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3C8B8921 for ; Fri, 7 Sep 2018 00:55:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2D6C42B06C for ; Fri, 7 Sep 2018 00:55:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 218A22B17E; Fri, 7 Sep 2018 00:55:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D65372B06C for ; Fri, 7 Sep 2018 00:55:56 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com
(Postfix) with ESMTP id 431C94E2399; Thu, 6 Sep 2018 17:55:45 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 107E64E1DDF for ; Thu, 6 Sep 2018 17:55:44 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 26AE8AC90; Fri, 7 Sep 2018 00:55:43 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137252.8267.5413803631901452139.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 32/34] lnet: lnet_dyn_del_ni: fix ping_info count X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP - use correct interface count for lnet_ping_info_setup(). - also rename 'net' to 'net_id' so the name 'net' is free to identify the lnet_net. 
This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/api-ni.c | 35 +++++++++++++++++------------ 1 file changed, 20 insertions(+), 15 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 2ce0a7212dc2..ff5149da2d79 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -2109,40 +2109,45 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) } int -lnet_dyn_del_ni(__u32 net) +lnet_dyn_del_ni(__u32 net_id) { - struct lnet_ni *ni; + struct lnet_net *net; struct lnet_ping_info *pinfo; struct lnet_handle_md md_handle; int rc; + int net_ni_count; /* don't allow userspace to shutdown the LOLND */ - if (LNET_NETTYP(net) == LOLND) + if (LNET_NETTYP(net_id) == LOLND) return -EINVAL; mutex_lock(&the_lnet.ln_api_mutex); + + lnet_net_lock(0); + + net = lnet_get_net_locked(net_id); + if (net == NULL) { + rc = -EINVAL; + goto out; + } + + net_ni_count = lnet_get_net_ni_count_locked(net); + + lnet_net_unlock(0); + /* create and link a new ping info, before removing the old one */ rc = lnet_ping_info_setup(&pinfo, &md_handle, - lnet_get_ni_count() - 1, false); + lnet_get_ni_count() - net_ni_count, false); if (rc) goto out; - ni = lnet_net2ni(net); - if (!ni) { - rc = -EINVAL; - goto failed; - } - - lnet_shutdown_lndni(ni); + lnet_shutdown_lndnet(net); if (!lnet_count_acceptor_nets()) lnet_acceptor_stop(); lnet_ping_target_update(pinfo, md_handle); - goto out; -failed: - lnet_ping_md_unlink(pinfo, &md_handle); - lnet_ping_info_free(pinfo); + out: mutex_unlock(&the_lnet.ln_api_mutex); From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591397 Return-Path: Received: from 
mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 710A8921 for ; Fri, 7 Sep 2018 00:56:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 60EA02AF87 for ; Fri, 7 Sep 2018 00:56:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 54E0D2B17D; Fri, 7 Sep 2018 00:56:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 66FA02AF87 for ; Fri, 7 Sep 2018 00:56:00 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 595964E32EE; Thu, 6 Sep 2018 17:55:54 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 69C244E2C9B for ; Thu, 6 Sep 2018 17:55:52 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 36E5DAD72; Fri, 7 Sep 2018 00:55:51 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137255.8267.7924367116215461900.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: 
StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 33/34] Completely re-write lnet_parse_networks(). X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP From: Amir Shehata Was: LU-7734 lnet: Multi-Rail local NI split This patch allows the configuration of multiple NIs under one Net. It is now possible to have multiple NIDs on the same network: Ex: @tcp, @tcp. This can be configured using the following syntax: Ex: tcp(eth0, eth1) The data structures for the example above can be visualized as follows NET(tcp) | ----------------- | | NI(eth0) NI(eth1) For more details, refer to the Multi-Rail Requirements and HLD documents. Signed-off-by: Amir Shehata Change-Id: Id7c73b9b811a3082b61e53b9e9f95743188cbd51 Reviewed-on: http://review.whamcloud.com/18274 Tested-by: Jenkins Reviewed-by: Doug Oucharek Tested-by: Maloo Reviewed-by: Olaf Weber Reviewed-by: Doug Oucharek --- drivers/staging/lustre/lnet/lnet/config.c | 341 ++++++++++++++++++----------- 1 file changed, 217 insertions(+), 124 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 11d6dbc80507..0571fa6a7249 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -48,8 +48,11 @@ static int lnet_tbnob; /* track text buf allocation */ #define LNET_MAX_TEXTBUF_NOB (64 << 10) /* bound allocation */ #define LNET_SINGLE_TEXTBUF_NOB (4 << 10) +#define SPACESTR " \t\v\r\n" +#define DELIMITERS ":()[]" + static void -lnet_syntax(char *name, char *str, int offset, int width) +lnet_syntax(const char *name, const char *str, int offset, int width) { static char dots[LNET_SINGLE_TEXTBUF_NOB]; static char
dashes[LNET_SINGLE_TEXTBUF_NOB]; @@ -363,6 +366,42 @@ lnet_net_alloc(__u32 net_id, struct list_head *net_list) return net; } +static int +lnet_ni_add_interface(struct lnet_ni *ni, char *iface) +{ + int niface = 0; + + if (ni == NULL) + return -ENOMEM; + + /* Allocate a separate piece of memory and copy + * into it the string, so we don't have + * a depencency on the tokens string. This way we + * can free the tokens at the end of the function. + * The newly allocated ni_interfaces[] can be + * freed when freeing the NI */ + while (niface < LNET_MAX_INTERFACES && + ni->ni_interfaces[niface] != NULL) + niface++; + + if (niface >= LNET_MAX_INTERFACES) { + LCONSOLE_ERROR_MSG(0x115, "Too many interfaces " + "for net %s\n", + libcfs_net2str(LNET_NIDNET(ni->ni_nid))); + return -EINVAL; + } + + ni->ni_interfaces[niface] = kstrdup(iface, GFP_KERNEL); + + if (ni->ni_interfaces[niface] == NULL) { + CERROR("Can't allocate net interface name\n"); + return -ENOMEM; + } + + return 0; +} + +/* allocate and add to the provided network */ struct lnet_ni * lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) { @@ -439,24 +478,33 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) goto failed; list_add_tail(&ni->ni_netlist, &net->net_ni_added); + /* if an interface name is provided then make sure to add in that + * interface name in NI */ + if (iface != NULL) + if (lnet_ni_add_interface(ni, iface) != 0) + goto failed; + return ni; failed: lnet_ni_free(ni); return NULL; } +/* + * Parse the networks string and create the matching set of NIs on the + * nilist. 
+ */ int lnet_parse_networks(struct list_head *netlist, char *networks) { - struct cfs_expr_list *el = NULL; + struct cfs_expr_list *net_el = NULL; + struct cfs_expr_list *ni_el = NULL; char *tokens; char *str; - char *tmp; struct lnet_net *net; struct lnet_ni *ni = NULL; __u32 net_id; int nnets = 0; - struct list_head *temp_node; if (!networks) { CERROR("networks string is undefined\n"); @@ -476,84 +524,108 @@ lnet_parse_networks(struct list_head *netlist, char *networks) return -ENOMEM; } - tmp = tokens; str = tokens; - while (str && *str) { - char *comma = strchr(str, ','); - char *bracket = strchr(str, '('); - char *square = strchr(str, '['); - char *iface; - int niface; + /* + * Main parser loop. + * + * NB we don't check interface conflicts here; it's the LNDs + * responsibility (if it cares at all) + */ + do { + char *nistr; + char *elstr; + char *name; int rc; /* - * NB we don't check interface conflicts here; it's the LNDs - * responsibility (if it cares at all) + * Parse a network string into its components. 
+ * + * {"("...")"}{"[""]"} */ - if (square && (!comma || square < comma)) { - /* - * i.e: o2ib0(ib0)[1,2], number between square - * brackets are CPTs this NI needs to be bond - */ - if (bracket && bracket > square) { - tmp = square; + + /* Network name (mandatory) + */ + while (isspace(*str)) + *str++ = '\0'; + if (!*str) + break; + name = str; + str += strcspn(str, SPACESTR ":()[],"); + while (isspace(*str)) + *str++ = '\0'; + + /* Interface list (optional) */ + if (*str == '(') { + *str++ = '\0'; + nistr = str; + str += strcspn(str, ")"); + if (*str != ')') { + str = nistr; goto failed_syntax; } + do { + *str++ = '\0'; + } while (isspace(*str)); + } else { + nistr = NULL; + } - tmp = strchr(square, ']'); - if (!tmp) { - tmp = square; + /* CPT expression (optional) */ + if (*str == '[') { + elstr = str; + str += strcspn(str, "]"); + if (*str != ']') { + str = elstr; goto failed_syntax; } - - rc = cfs_expr_list_parse(square, tmp - square + 1, - 0, LNET_CPT_NUMBER - 1, &el); + rc = cfs_expr_list_parse(elstr, str - elstr + 1, + 0, LNET_CPT_NUMBER - 1, + &net_el); if (rc) { - tmp = square; + str = elstr; goto failed_syntax; } - - while (square <= tmp) - *square++ = ' '; + *elstr = '\0'; + do { + *str++ = '\0'; + } while (isspace(*str)); } - if (!bracket || (comma && comma < bracket)) { - /* no interface list specified */ + /* Bad delimiters */ + if (*str && (strchr(DELIMITERS, *str) != NULL)) + goto failed_syntax; - if (comma) - *comma++ = 0; - net_id = libcfs_str2net(strim(str)); + /* go to the next net if it exists */ + str += strcspn(str, ","); + if (*str == ',') + *str++ = '\0'; - if (net_id == LNET_NIDNET(LNET_NID_ANY)) { - LCONSOLE_ERROR_MSG(0x113, - "Unrecognised network type\n"); - tmp = str; - goto failed_syntax; - } - - if (LNET_NETTYP(net_id) != LOLND) { /* LO is implicit */ - net = lnet_net_alloc(net_id, netlist); - if (!net || - !lnet_ni_alloc(net, el, NULL)) - goto failed; - } + /* + * At this point the name is properly terminated.
+ */ + net_id = libcfs_str2net(name); + if (net_id == LNET_NIDNET(LNET_NID_ANY)) { + LCONSOLE_ERROR_MSG(0x113, + "Unrecognised network type\n"); + str = name; + goto failed_syntax; + } - if (el) { - cfs_expr_list_free(el); - el = NULL; + if (LNET_NETTYP(net_id) == LOLND) { + /* Loopback is implicit, and there can be only one. */ + if (net_el) { + cfs_expr_list_free(net_el); + net_el = NULL; } - - str = comma; + /* Should we error out instead? */ continue; } - *bracket = 0; - net_id = libcfs_str2net(strim(str)); - if (net_id == LNET_NIDNET(LNET_NID_ANY)) { - tmp = str; - goto failed_syntax; - } + /* + * All network paramaters are now known. + */ + nnets++; /* always allocate a net, since we will eventually add an * interface to it, or we will fail, in which case we'll @@ -562,88 +634,107 @@ lnet_parse_networks(struct list_head *netlist, char *networks) if (IS_ERR_OR_NULL(net)) goto failed; - ni = lnet_ni_alloc(net, el, NULL); - if (IS_ERR_OR_NULL(ni)) - goto failed; - - if (el) { - cfs_expr_list_free(el); - el = NULL; - } - - niface = 0; - iface = bracket + 1; + if (!nistr) { + /* + * No interface list was specified, allocate a + * ni using the defaults. 
+ */ + ni = lnet_ni_alloc(net, net_el, NULL); + if (IS_ERR_OR_NULL(ni)) + goto failed; - bracket = strchr(iface, ')'); - if (!bracket) { - tmp = iface; - goto failed_syntax; + if (net_el) { + cfs_expr_list_free(net_el); + net_el = NULL; + } + continue; } - *bracket = 0; do { - comma = strchr(iface, ','); - if (comma) - *comma++ = 0; - - iface = strim(iface); - if (!*iface) { - tmp = iface; - goto failed_syntax; + elstr = NULL; + + /* Interface name (mandatory) */ + while (isspace(*nistr)) + *nistr++ = '\0'; + name = nistr; + nistr += strcspn(nistr, SPACESTR "[],"); + while (isspace(*nistr)) + *nistr++ = '\0'; + + /* CPT expression (optional) */ + if (*nistr == '[') { + elstr = nistr; + nistr += strcspn(nistr, "]"); + if (*nistr != ']') { + str = elstr; + goto failed_syntax; + } + rc = cfs_expr_list_parse(elstr, + nistr - elstr + 1, + 0, LNET_CPT_NUMBER - 1, + &ni_el); + if (rc != 0) { + str = elstr; + goto failed_syntax; + } + *elstr = '\0'; + do { + *nistr++ = '\0'; + } while (isspace(*nistr)); + } else { + ni_el = net_el; } - if (niface == LNET_MAX_INTERFACES) { - LCONSOLE_ERROR_MSG(0x115, - "Too many interfaces for net %s\n", - libcfs_net2str(net_id)); - goto failed; + /* + * End of single interface specificaton, + * advance to the start of the next one, if + * any. + */ + if (*nistr == ',') { + do { + *nistr++ = '\0'; + } while (isspace(*nistr)); + if (!*nistr) { + str = nistr; + goto failed_syntax; + } + } else if (*nistr) { + str = nistr; + goto failed_syntax; } /* - * Allocate a separate piece of memory and copy - * into it the string, so we don't have - * a depencency on the tokens string. This way we - * can free the tokens at the end of the function. - * The newly allocated ni_interfaces[] can be - * freed when freeing the NI + * At this point the name + is properly terminated. 
*/ - ni->ni_interfaces[niface] = kstrdup(iface, GFP_KERNEL); - if (!ni->ni_interfaces[niface]) { - CERROR("Can't allocate net interface name\n"); - goto failed; - } - niface++; - iface = comma; - } while (iface); - - str = bracket + 1; - comma = strchr(bracket + 1, ','); - if (comma) { - *comma = 0; - str = strim(str); - if (*str) { - tmp = str; + if (!*name) { + str = name; goto failed_syntax; } - str = comma + 1; - continue; - } - str = strim(str); - if (*str) { - tmp = str; - goto failed_syntax; - } - } + ni = lnet_ni_alloc(net, ni_el, name); + if (IS_ERR_OR_NULL(ni)) + goto failed; - list_for_each(temp_node, netlist) - nnets++; + if (ni_el) { + if (ni_el != net_el) { + cfs_expr_list_free(ni_el); + ni_el = NULL; + } + } + } while (*nistr); + + if (net_el) { + cfs_expr_list_free(net_el); + net_el = NULL; + } + } while (*str); kfree(tokens); return nnets; failed_syntax: - lnet_syntax("networks", networks, (int)(tmp - tokens), strlen(tmp)); + lnet_syntax("networks", networks, (int)(str - tokens), strlen(str)); failed: /* free the net list and all the nis on each net */ while (!list_empty(netlist)) { @@ -653,8 +744,10 @@ lnet_parse_networks(struct list_head *netlist, char *networks) lnet_net_free(net); } - if (el) - cfs_expr_list_free(el); + if (ni_el && ni_el != net_el) + cfs_expr_list_free(ni_el); + if (net_el) + cfs_expr_list_free(net_el); kfree(tokens); From patchwork Fri Sep 7 00:49:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591399 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 595E8921 for ; Fri, 7 Sep 2018 00:56:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4B9E12AF87 for ; Fri, 7 Sep 2018 00:56:06 +0000 (UTC) Received: by 
mail.wl.linuxfoundation.org (Postfix, from userid 486) id 404352B17D; Fri, 7 Sep 2018 00:56:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id BB5C12AF87 for ; Fri, 7 Sep 2018 00:56:05 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 38B8C4E33DD; Thu, 6 Sep 2018 17:56:01 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 43DC84E33C2 for ; Thu, 6 Sep 2018 17:55:59 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 6DE34AC90; Fri, 7 Sep 2018 00:55:58 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:32 +1000 Message-ID: <153628137259.8267.2306617574801795911.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 34/34] lnet: introduce use_tcp_bonding mod param X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 3 + drivers/staging/lustre/lnet/lnet/api-ni.c | 22 ++++++++- drivers/staging/lustre/lnet/lnet/config.c | 50 ++++++++++++++++---- 3 files changed, 61 insertions(+), 14 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index ef551b571935..5ee770cd7a5f 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -629,7 +629,8 @@ void lnet_swap_pinginfo(struct lnet_ping_info *info); int lnet_parse_ip2nets(char **networksp, char *ip2nets); int lnet_parse_routes(char *route_str, int *im_a_router); -int lnet_parse_networks(struct list_head *nilist, char *networks); +int lnet_parse_networks(struct list_head *nilist, char *networks, + bool use_tcp_bonding); bool lnet_net_unique(__u32 net_id, struct list_head *nilist, struct lnet_net **net); bool lnet_ni_unique_net(struct list_head *nilist, char *iface); diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index ff5149da2d79..8ff386992c99 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -59,6 +59,11 @@ static int rnet_htable_size = LNET_REMOTE_NETS_HASH_DEFAULT; module_param(rnet_htable_size, int, 0444); MODULE_PARM_DESC(rnet_htable_size, "size of remote network hash table"); +static int use_tcp_bonding = false; +module_param(use_tcp_bonding, int, 0444); +MODULE_PARM_DESC(use_tcp_bonding, + "Set to 1 to use socklnd bonding. 
0 to use Multi-Rail"); + static int lnet_ping(struct lnet_process_id id, signed long timeout, struct lnet_process_id __user *ids, int n_ids); @@ -1446,6 +1451,18 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) * to avoid a memory leak. */ + /* + * When a network uses TCP bonding then all its interfaces + * must be specified when the network is first defined: the + * TCP bonding code doesn't allow for interfaces to be added + * or removed. + */ + if (net_l != net && net_l != NULL && use_tcp_bonding && + LNET_NETTYP(net_l->net_id) == SOCKLND) { + rc = -EINVAL; + goto failed0; + } + while (!list_empty(&net->net_ni_added)) { ni = list_entry(net->net_ni_added.next, struct lnet_ni, ni_netlist); @@ -1702,7 +1719,8 @@ LNetNIInit(lnet_pid_t requested_pid) * routes if it has been loaded */ if (!the_lnet.ln_nis_from_mod_params) { - rc = lnet_parse_networks(&net_head, lnet_get_networks()); + rc = lnet_parse_networks(&net_head, lnet_get_networks(), + use_tcp_bonding); if (rc < 0) goto err_empty_list; } @@ -2000,7 +2018,7 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) lnd_tunables = (struct lnet_ioctl_config_lnd_tunables *)conf->cfg_bulk; /* Create a net/ni structures for the network string */ - rc = lnet_parse_networks(&net_head, nets); + rc = lnet_parse_networks(&net_head, nets, use_tcp_bonding); if (rc <= 0) return !rc ? -EINVAL : rc; diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 0571fa6a7249..abfc5d8dc219 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -117,6 +117,21 @@ lnet_ni_unique_net(struct list_head *nilist, char *iface) return true; } +/* check that the NI is unique to the interfaces with in the same NI. 
+ * This is only a consideration if use_tcp_bonding is set */ +static bool +lnet_ni_unique_ni(char *iface_list[LNET_MAX_INTERFACES], char *iface) +{ + int i; + for (i = 0; i < LNET_MAX_INTERFACES; i++) { + if (iface_list[i] != NULL && + strncmp(iface_list[i], iface, strlen(iface)) == 0) + return false; + } + + return true; +} + static bool in_array(__u32 *array, __u32 size, __u32 value) { @@ -374,6 +389,9 @@ lnet_ni_add_interface(struct lnet_ni *ni, char *iface) if (ni == NULL) return -ENOMEM; + if (!lnet_ni_unique_ni(ni->ni_interfaces, iface)) + return -EINVAL; + /* Allocate a separate piece of memory and copy * into it the string, so we don't have * a depencency on the tokens string. This way we @@ -495,7 +513,8 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) * nilist. */ int -lnet_parse_networks(struct list_head *netlist, char *networks) +lnet_parse_networks(struct list_head *netlist, char *networks, + bool use_tcp_bonding) { struct cfs_expr_list *net_el = NULL; struct cfs_expr_list *ni_el = NULL; @@ -634,7 +653,8 @@ lnet_parse_networks(struct list_head *netlist, char *networks) if (IS_ERR_OR_NULL(net)) goto failed; - if (!nistr) { + if (!nistr || + (use_tcp_bonding && LNET_NETTYP(net_id) == SOCKLND)) { /* * No interface list was specified, allocate a * ni using the defaults. @@ -643,11 +663,13 @@ lnet_parse_networks(struct list_head *netlist, char *networks) if (IS_ERR_OR_NULL(ni)) goto failed; - if (net_el) { - cfs_expr_list_free(net_el); - net_el = NULL; + if (!nistr) { + if (net_el) { + cfs_expr_list_free(net_el); + net_el = NULL; + } + continue; } - continue; } do { @@ -704,17 +726,23 @@ lnet_parse_networks(struct list_head *netlist, char *networks) } /* - * At this point the name - is properly terminated. + * At this point the name is properly terminated. 
*/ if (!*name) { str = name; goto failed_syntax; } - ni = lnet_ni_alloc(net, ni_el, name); - if (IS_ERR_OR_NULL(ni)) - goto failed; + if (use_tcp_bonding && + LNET_NETTYP(net->net_id) == SOCKLND) { + rc = lnet_ni_add_interface(ni, name); + if (rc != 0) + goto failed; + } else { + ni = lnet_ni_alloc(net, ni_el, name); + if (IS_ERR_OR_NULL(ni)) + goto failed; + } if (ni_el) { if (ni_el != net_el) {