From patchwork Fri Sep 7 00:49:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10591365 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 372DD112B for ; Fri, 7 Sep 2018 00:54:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 26D6E2AF87 for ; Fri, 7 Sep 2018 00:54:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 197532B17D; Fri, 7 Sep 2018 00:54:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id A62B62AF87 for ; Fri, 7 Sep 2018 00:54:01 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 464154E31F3; Thu, 6 Sep 2018 17:54:01 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 3B2D94E31CA for ; Thu, 6 Sep 2018 17:53:59 -0700 (PDT) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 59766AD72; Fri, 7 Sep 2018 00:53:58 +0000 (UTC) From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Fri, 07 Sep 2018 10:49:31 +1000 Message-ID: <153628137195.8267.16400748098054215181.stgit@noble> In-Reply-To: <153628058697.8267.6056114844033479774.stgit@noble> References: <153628058697.8267.6056114844033479774.stgit@noble> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 18/34] lnet: add ni_state X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is barely used. This is part of 8cbb8cd3e771e7f7e0f99cafc19fad32770dc015 LU-7734 lnet: Multi-Rail local NI split Signed-off-by: NeilBrown Reviewed-by: Doug Oucharek Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 1 + .../staging/lustre/include/linux/lnet/lib-types.h | 16 ++++++++++++++++ drivers/staging/lustre/lnet/lnet/api-ni.c | 16 ++++++++++++++++ drivers/staging/lustre/lnet/lnet/config.c | 1 + 4 files changed, 34 insertions(+) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index faa3f19dd844..54a93235834c 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -400,6 +400,7 @@ int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni); struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt); struct lnet_ni *lnet_net2ni_locked(__u32 net, int cpt); struct lnet_ni *lnet_net2ni(__u32 net); +bool lnet_is_ni_healthy_locked(struct lnet_ni *ni); extern int portal_rotor; diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 1d372672e2de..6c34ecf22021 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -256,6 +256,19 @@ struct lnet_tx_queue { struct list_head tq_delayed; /* delayed TXs */ }; +enum lnet_ni_state { + /* set when NI block is allocated */ + LNET_NI_STATE_INIT = 0, + /* set when NI is started successfully */ + LNET_NI_STATE_ACTIVE, + /* set when LND notifies NI failed */ + LNET_NI_STATE_FAILED, + /* set when LND notifies NI degraded */ + LNET_NI_STATE_DEGRADED, + /* set when shuttding down NI */ + LNET_NI_STATE_DELETING +}; + struct lnet_net { /* chain on the ln_nets */ struct list_head net_list; @@ -324,6 +337,9 @@ struct lnet_ni { /* my health status */ struct lnet_ni_status *ni_status; + /* NI FSM */ + enum lnet_ni_state ni_state; + /* per NI LND tunables */ struct lnet_lnd_tunables ni_lnd_tunables; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 46c5ca71bc07..618fdf8141f0 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -780,6 +780,16 @@ lnet_islocalnet(__u32 net) return !!ni; } +bool +lnet_is_ni_healthy_locked(struct lnet_ni *ni) +{ + if (ni->ni_state == LNET_NI_STATE_ACTIVE || + ni->ni_state == LNET_NI_STATE_DEGRADED) + return true; + + return false; +} + struct lnet_ni * lnet_nid2ni_locked(lnet_nid_t nid, int cpt) { @@ -1117,6 +1127,9 @@ lnet_clear_zombies_nis_locked(struct lnet_net *net) ni = list_entry(zombie_list->next, struct lnet_ni, ni_netlist); list_del_init(&ni->ni_netlist); + /* the ni should be in deleting state. If it's not it's + * a bug */ + LASSERT(ni->ni_state == LNET_NI_STATE_DELETING); cfs_percpt_for_each(ref, j, ni->ni_refs) { if (!*ref) continue; @@ -1163,6 +1176,7 @@ lnet_shutdown_lndni(struct lnet_ni *ni) struct lnet_net *net = ni->ni_net; lnet_net_lock(LNET_LOCK_EX); + ni->ni_state = LNET_NI_STATE_DELETING; lnet_ni_unlink_locked(ni); lnet_net_unlock(LNET_LOCK_EX); @@ -1291,6 +1305,8 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) lnet_net_unlock(LNET_LOCK_EX); + ni->ni_state = LNET_NI_STATE_ACTIVE; + if (net->net_lnd->lnd_type == LOLND) { lnet_ni_addref(ni); LASSERT(!the_lnet.ln_loni); diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 2588d67fea1b..081812e19b13 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -393,6 +393,7 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) ni->ni_net_ns = NULL; ni->ni_last_alive = ktime_get_real_seconds(); + ni->ni_state = LNET_NI_STATE_INIT; rc = lnet_net_append_cpts(ni->ni_cpts, ni->ni_ncpts, net); if (rc != 0) goto failed;