From patchwork Sat May 15 13:05:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259787 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EA49C433B4 for ; Sat, 15 May 2021 13:06:43 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 26D45611C9 for ; Sat, 15 May 2021 13:06:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 26D45611C9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 131FF21FB1C; Sat, 15 May 2021 06:06:29 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0D81721CAD2 for ; Sat, 15 May 2021 06:06:14 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 7860D100676E; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 6EA3193BCC; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown 
Date: Sat, 15 May 2021 09:05:58 -0400 Message-Id: <1621083970-32463-2-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 01/13] lnet: Allow delayed sends X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn The net_delay_add has some code related to delaying sends, but it isn't fully implemented. Modify lnet_post_send_locked() to check whether the message being sent matches a rule and should be delayed. Fix some bugs with how the delay timers were set and checked. HPE-bug-id: LUS-7651 WC-bug-id: https://jira.whamcloud.com/browse/LU-14627 Lustre-commit: ab14f3bc852e7081 ("LU-14627 lnet: Allow delayed sends") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/43416 Reviewed-by: Serguei Smirnov Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-lnet.h | 1 + net/lnet/lnet/lib-move.c | 8 +++++++- net/lnet/lnet/net_fault.c | 24 +++++++++++++++++++----- 3 files changed, 27 insertions(+), 6 deletions(-) diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h index 674f9d1..6b9e926 100644 --- a/include/linux/lnet/lib-lnet.h +++ b/include/linux/lnet/lib-lnet.h @@ -630,6 +630,7 @@ void lnet_recv(struct lnet_ni *ni, void *private, struct lnet_msg *msg, void lnet_ni_recv(struct lnet_ni *ni, void *private, struct lnet_msg *msg, int delayed, unsigned int offset, unsigned int mlen, unsigned int rlen); +void lnet_ni_send(struct lnet_ni *ni, struct lnet_msg *msg); struct lnet_msg 
*lnet_create_reply_msg(struct lnet_ni *ni, struct lnet_msg *get_msg); diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index cb0943e..6d0637c 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -530,7 +530,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats, msg->msg_hdr.payload_length = cpu_to_le32(len); } -static void +void lnet_ni_send(struct lnet_ni *ni, struct lnet_msg *msg) { void *priv = msg->msg_private; @@ -733,6 +733,12 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats, } } + if (unlikely(!list_empty(&the_lnet.ln_delay_rules)) && + lnet_delay_rule_match_locked(&msg->msg_hdr, msg)) { + msg->msg_tx_delayed = 1; + return LNET_CREDIT_WAIT; + } + /* unset the tx_delay flag as we're going to send it now */ msg->msg_tx_delayed = 0; diff --git a/net/lnet/lnet/net_fault.c b/net/lnet/lnet/net_fault.c index 515aa05..0d19da4 100644 --- a/net/lnet/lnet/net_fault.c +++ b/net/lnet/lnet/net_fault.c @@ -536,6 +536,7 @@ struct delay_daemon_data { { struct lnet_fault_attr *attr = &rule->dl_attr; bool delay; + time64_t now = ktime_get_seconds(); if (!lnet_fault_attr_match(attr, src, LNET_NID_ANY, dst, type, portal)) @@ -544,8 +545,6 @@ struct delay_daemon_data { /* match this rule, check delay rate now */ spin_lock(&rule->dl_lock); if (rule->dl_delay_time) { /* time based delay */ - time64_t now = ktime_get_seconds(); - rule->dl_stat.fs_count++; delay = now >= rule->dl_delay_time; if (delay) { @@ -587,10 +586,11 @@ struct delay_daemon_data { rule->dl_stat.u.delay.ls_delayed++; list_add_tail(&msg->msg_list, &rule->dl_msg_list); - msg->msg_delay_send = ktime_get_seconds() + attr->u.delay.la_latency; + msg->msg_delay_send = now + attr->u.delay.la_latency; if (rule->dl_msg_send == -1) { rule->dl_msg_send = msg->msg_delay_send; - mod_timer(&rule->dl_timer, jiffies + rule->dl_msg_send * HZ); + mod_timer(&rule->dl_timer, + jiffies + attr->u.delay.la_latency * HZ); } 
spin_unlock(&rule->dl_lock); @@ -662,7 +662,8 @@ struct delay_daemon_data { msg = list_first_entry(&rule->dl_msg_list, struct lnet_msg, msg_list); rule->dl_msg_send = msg->msg_delay_send; - mod_timer(&rule->dl_timer, jiffies + rule->dl_msg_send * HZ); + mod_timer(&rule->dl_timer, + jiffies + (msg->msg_delay_send - now) * HZ); } spin_unlock(&rule->dl_lock); } @@ -678,6 +679,19 @@ struct delay_daemon_data { int cpt; int rc; + if (msg->msg_sending) { + /* Delayed send */ + list_del_init(&msg->msg_list); + ni = msg->msg_txni; + CDEBUG(D_NET, "TRACE: msg %p %s -> %s : %s\n", msg, + libcfs_nid2str(ni->ni_nid), + libcfs_nid2str(msg->msg_txpeer->lpni_nid), + lnet_msgtyp2str(msg->msg_type)); + lnet_ni_send(ni, msg); + continue; + } + + /* Delayed receive */ LASSERT(msg->msg_rxpeer); LASSERT(msg->msg_rxni); From patchwork Sat May 15 13:05:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259789 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8098C433B4 for ; Sat, 15 May 2021 13:06:45 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5BE3A61355 for ; Sat, 15 May 2021 13:06:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5BE3A61355 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 327C121F792; Sat, 15 May 2021 06:06:30 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 666E921CAD2 for ; Sat, 15 May 2021 06:06:14 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 7A61D1006771; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 7180E93BCE; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:05:59 -0400 Message-Id: <1621083970-32463-3-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 02/13] lustre: lov: correctly handling sub-lock init failure X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Bobi Jam In lov_lock_sub_init(), if a sublock initialization fails, it needs to bail out of the outer loop as well as the inner one. WC-bug-id: https://jira.whamcloud.com/browse/LU-14618 Lustre-commit: 1a5169f9962e254 ("LU-14618 lov: correctly handling sub-lock init failure") Signed-off-by: Bobi Jam Reviewed-on: https://review.whamcloud.com/43345 Reviewed-by: John L. 
Hammond Reviewed-by: Andreas Dilger Reviewed-by: Wang Shilong Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/lov/lov_lock.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/lustre/lov/lov_lock.c b/fs/lustre/lov/lov_lock.c index efaca37..d137614 100644 --- a/fs/lustre/lov/lov_lock.c +++ b/fs/lustre/lov/lov_lock.c @@ -198,6 +198,8 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env, lls->sub_initialized = 1; nr++; } + if (result < 0) + break; } LASSERT(ergo(result == 0, nr == lovlck->lls_nr)); From patchwork Sat May 15 13:06:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259791 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDD16C433ED for ; Sat, 15 May 2021 13:06:48 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 83C26611C9 for ; Sat, 15 May 2021 13:06:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 83C26611C9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id F229721FA83; Sat, 15 May 2021 06:06:31 
-0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9F50121CAD2 for ; Sat, 15 May 2021 06:06:14 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 7DD671006772; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 73D9998124; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:00 -0400 Message-Id: <1621083970-32463-4-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 03/13] lnet: Local NI must be on same net as next-hop X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn When sending to a remote peer we need to restrict our selection of a local NI to those on the same peer net as the next-hop. The code currently selects a local NI on the peer net specified by the lr_lnet field of the lnet_route returned by lnet_find_route_locked(). However, lnet_find_route_locked() may select a next-hop peer NI on any local peer net - not just lr_lnet. A redundant assignment to sd->sd_msg->msg_src_nid_param is also removed. That variable is always set appropriately in lnet_select_pathway(). 
HPE-bug-id: LUS-9095 WC-bug-id: https://jira.whamcloud.com/browse/LU-13781 Lustre-commit: 031c087f3847777c ("LU-13781 lnet: Local NI must be on same net as next-hop") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/39352 Reviewed-by: Neil Brown Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/lnet/lib-move.c | 26 +++++++++----------------- 1 file changed, 9 insertions(+), 17 deletions(-) diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index 6d0637c..3ae0209 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -1907,7 +1907,6 @@ struct lnet_ni * struct lnet_peer **gw_peer) { int rc; - u32 local_lnet; struct lnet_peer *gw; struct lnet_peer *lp; struct lnet_peer_net *lpn; @@ -1936,10 +1935,8 @@ struct lnet_ni * if (gwni) { gw = gwni->lpni_peer_net->lpn_peer; lnet_peer_ni_decref_locked(gwni); - if (gw->lp_rtr_refcount) { - local_lnet = LNET_NIDNET(sd->sd_rtr_nid); + if (gw->lp_rtr_refcount) route_found = true; - } } else { CWARN("No peer NI for gateway %s. Attempting to find an alternative route.\n", libcfs_nid2str(sd->sd_rtr_nid)); @@ -2054,31 +2051,26 @@ struct lnet_ni * gw = best_route->lr_gateway; LASSERT(gw == gwni->lpni_peer_net->lpn_peer); - local_lnet = best_route->lr_lnet; } /* Discover this gateway if it hasn't already been discovered. * This means we might delay the message until discovery has * completed */ - sd->sd_msg->msg_src_nid_param = sd->sd_src_nid; rc = lnet_initiate_peer_discovery(gwni, sd->sd_msg, sd->sd_cpt); if (rc) return rc; if (!sd->sd_best_ni) { - struct lnet_peer_net *lpeer; - - lpeer = lnet_peer_get_net_locked(gw, local_lnet); - sd->sd_best_ni = lnet_find_best_ni_on_spec_net(NULL, gw, lpeer, + lpn = gwni->lpni_peer_net; + sd->sd_best_ni = lnet_find_best_ni_on_spec_net(NULL, gw, lpn, sd->sd_md_cpt); - } - - if (!sd->sd_best_ni) { - CERROR("Internal Error. 
Expected local ni on %s but non found :%s\n", - libcfs_net2str(local_lnet), - libcfs_nid2str(sd->sd_src_nid)); - return -EFAULT; + if (!sd->sd_best_ni) { + CERROR("Internal Error. Expected local ni on %s but non found :%s\n", + libcfs_net2str(lpn->lpn_net_id), + libcfs_nid2str(sd->sd_src_nid)); + return -EFAULT; + } } *gw_lpni = gwni; From patchwork Sat May 15 13:06:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259795 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B33C3C433B4 for ; Sat, 15 May 2021 13:06:53 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 74E2F61355 for ; Sat, 15 May 2021 13:06:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 74E2F61355 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2C0AB21FAC2; Sat, 15 May 2021 06:06:35 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E9F4821CAD2 for ; Sat, 15 May 2021 06:06:14 -0700 (PDT) Received: 
from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 833581006773; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 75EE19814B; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:01 -0400 Message-Id: <1621083970-32463-5-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 04/13] lnet: socklnd: add conns_per_peer parameter X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Serguei Smirnov , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Serguei Smirnov Introduce conns_per_peer ksocklnd module parameter. In typed mode, this parameter shall control the number of BULK_IN and BULK_OUT tcp connections, while the number of CONTROL connections shall stay at 1. In untyped mode, this parameter shall control the number of untyped connections. The default conns_per_peer is 1. Max is 127. 
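The connection totals implied by this description can be sketched in plain user-space C. This is only an illustration based on the comment the patch adds next to ksnd_conns_per_peer ("1 + 2*conns_per_peer" in typed mode, "conns_per_peer" in untyped mode); the function name total_conns_per_peer() is hypothetical and does not exist in socklnd.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: total TCP connections one
 * peer would end up with for a given conns_per_peer setting.
 */
static unsigned int total_conns_per_peer(unsigned int conns_per_peer,
					 bool typed)
{
	/* Same zero fallback as ksocknal_conns_per_peer() in the patch. */
	if (conns_per_peer == 0)
		conns_per_peer = 1;

	/* Typed mode: one CONTROL conn plus N BULK_IN and N BULK_OUT conns. */
	if (typed)
		return 1 + 2 * conns_per_peer;

	/* Untyped mode: N connections of type SOCKLND_CONN_ANY. */
	return conns_per_peer;
}
```

With the default of 1, typed mode keeps the usual three connections (CONTROL, BULK_IN, BULK_OUT); raising the parameter to 4 would give nine in typed mode and four in untyped mode.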
WC-bug-id: https://jira.whamcloud.com/browse/LU-12815 Lustre-commit: 71b2476e4ddb95aa ("LU-12815 socklnd: add conns_per_peer parameter") Signed-off-by: Serguei Smirnov Reviewed-on: https://review.whamcloud.com/41056 Reviewed-by: Andreas Dilger Reviewed-by: James Simmons Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/socklnd/socklnd.c | 92 ++++++++++++++++++++++++++++-- net/lnet/klnds/socklnd/socklnd.h | 23 ++++++-- net/lnet/klnds/socklnd/socklnd_cb.c | 3 +- net/lnet/klnds/socklnd/socklnd_modparams.c | 9 +++ 4 files changed, 115 insertions(+), 12 deletions(-) diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c index 4c79d1a..3a667e5 100644 --- a/net/lnet/klnds/socklnd/socklnd.c +++ b/net/lnet/klnds/socklnd/socklnd.c @@ -132,6 +132,9 @@ static int ksocknal_ip2index(struct sockaddr *addr, struct lnet_ni *ni) conn_cb->ksnr_connected = 0; conn_cb->ksnr_deleted = 0; conn_cb->ksnr_conn_count = 0; + conn_cb->ksnr_ctrl_conn_count = 0; + conn_cb->ksnr_blki_conn_count = 0; + conn_cb->ksnr_blko_conn_count = 0; return conn_cb; } @@ -364,6 +367,73 @@ struct ksock_peer_ni * return rc; } +static unsigned int +ksocknal_get_conn_count_by_type(struct ksock_conn_cb *conn_cb, + int type) +{ + unsigned int count = 0; + + switch (type) { + case SOCKLND_CONN_CONTROL: + count = conn_cb->ksnr_ctrl_conn_count; + break; + case SOCKLND_CONN_BULK_IN: + count = conn_cb->ksnr_blki_conn_count; + break; + case SOCKLND_CONN_BULK_OUT: + count = conn_cb->ksnr_blko_conn_count; + break; + case SOCKLND_CONN_ANY: + count = conn_cb->ksnr_conn_count; + break; + default: + LBUG(); + break; + } + + return count; +} + +static void +ksocknal_incr_conn_count(struct ksock_conn_cb *conn_cb, + int type) +{ + conn_cb->ksnr_conn_count++; + + /* check if all connections of the given type got created */ + switch (type) { + case SOCKLND_CONN_CONTROL: + conn_cb->ksnr_ctrl_conn_count++; + /* there's a single control connection per peer */ + 
conn_cb->ksnr_connected |= BIT(type); + break; + case SOCKLND_CONN_BULK_IN: + conn_cb->ksnr_blki_conn_count++; + if (conn_cb->ksnr_blki_conn_count >= + *ksocknal_tunables.ksnd_conns_per_peer) + conn_cb->ksnr_connected |= BIT(type); + break; + case SOCKLND_CONN_BULK_OUT: + conn_cb->ksnr_blko_conn_count++; + if (conn_cb->ksnr_blko_conn_count >= + *ksocknal_tunables.ksnd_conns_per_peer) + conn_cb->ksnr_connected |= BIT(type); + break; + case SOCKLND_CONN_ANY: + if (conn_cb->ksnr_conn_count >= + *ksocknal_tunables.ksnd_conns_per_peer) + conn_cb->ksnr_connected |= BIT(type); + break; + default: + LBUG(); + break; + } + + CDEBUG(D_NET, "Add conn type %d, ksnr_connected %x conns_per_peer %d\n", + type, conn_cb->ksnr_connected, + *ksocknal_tunables.ksnd_conns_per_peer); +} + static void ksocknal_associate_cb_conn_locked(struct ksock_conn_cb *conn_cb, struct ksock_conn *conn) @@ -407,8 +477,7 @@ struct ksock_peer_ni * iface->ksni_nroutes++; } - conn_cb->ksnr_connected |= (1 << type); - conn_cb->ksnr_conn_count++; + ksocknal_incr_conn_count(conn_cb, type); /* Successful connection => further attempts can * proceed immediately @@ -728,6 +797,7 @@ struct ksock_peer_ni * int rc2; int rc; int active; + int num_dup = 0; char *warn = NULL; active = !!conn_cb; @@ -928,11 +998,14 @@ struct ksock_peer_ni * conn2->ksnc_type != conn->ksnc_type) continue; - /* - * Reply on a passive connection attempt so the peer + num_dup++; + if (num_dup < *ksocknal_tunables.ksnd_conns_per_peer) + continue; + + /* Reply on a passive connection attempt so the peer_ni * realises we're connected. */ - LASSERT(!rc); + LASSERT(rc == 0); if (!active) rc = EALREADY; @@ -1148,7 +1221,14 @@ struct ksock_peer_ni * if (conn_cb) { /* dissociate conn from cb... 
*/ LASSERT(!conn_cb->ksnr_deleted); - LASSERT(conn_cb->ksnr_connected & BIT(conn->ksnc_type)); + + /* connected bit is set only if all connections + * of the given type got created + */ + if (ksocknal_get_conn_count_by_type(conn_cb, conn->ksnc_type) == + *ksocknal_tunables.ksnd_conns_per_peer) + LASSERT((conn_cb->ksnr_connected & + BIT(conn->ksnc_type)) != 0); list_for_each_entry(conn2, &peer_ni->ksnp_conns, ksnc_list) { if (conn2->ksnc_conn_cb == conn_cb && diff --git a/net/lnet/klnds/socklnd/socklnd.h b/net/lnet/klnds/socklnd/socklnd.h index 9f8fe8a..dac8559 100644 --- a/net/lnet/klnds/socklnd/socklnd.h +++ b/net/lnet/klnds/socklnd/socklnd.h @@ -163,6 +163,11 @@ struct ksock_tunables { int *ksnd_zc_recv_min_nfrags; /* minimum # of fragments to * enable ZC receive */ + int *ksnd_conns_per_peer; /* for typed mode, yields: + * 1 + 2*conns_per_peer total + * for untyped: + * conns_per_peer total + */ }; struct ksock_net { @@ -371,6 +376,8 @@ struct ksock_conn { */ }; +#define SOCKNAL_CONN_COUNT_MAX_BITS 8 /* max conn count bits */ + struct ksock_conn_cb { struct list_head ksnr_connd_list; /* chain on ksnr_connd_routes */ struct ksock_peer_ni *ksnr_peer; /* owning peer_ni */ @@ -389,8 +396,11 @@ struct ksock_conn_cb { * type */ unsigned int ksnr_deleted:1; /* been removed from peer_ni? */ - int ksnr_conn_count; /* # conns established by this - * route + unsigned int ksnr_ctrl_conn_count:1; /* # conns by type */ + unsigned int ksnr_blki_conn_count:8; + unsigned int ksnr_blko_conn_count:8; + int ksnr_conn_count; /* total # conns for + * this cb */ }; @@ -595,9 +605,12 @@ struct ksock_proto { static inline int ksocknal_timeout(void) { - return *ksocknal_tunables.ksnd_timeout ? 
- *ksocknal_tunables.ksnd_timeout : - lnet_get_lnd_timeout(); + return *ksocknal_tunables.ksnd_timeout ?: lnet_get_lnd_timeout(); +} + +static inline int ksocknal_conns_per_peer(void) +{ + return *ksocknal_tunables.ksnd_conns_per_peer ?: 1; } int ksocknal_startup(struct lnet_ni *ni); diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c index 43658b2..bfb98f5 100644 --- a/net/lnet/klnds/socklnd/socklnd_cb.c +++ b/net/lnet/klnds/socklnd/socklnd_cb.c @@ -1818,7 +1818,8 @@ void ksocknal_write_callback(struct ksock_conn *conn) type = SOCKLND_CONN_ANY; } else if (wanted & BIT(SOCKLND_CONN_CONTROL)) { type = SOCKLND_CONN_CONTROL; - } else if (wanted & BIT(SOCKLND_CONN_BULK_IN)) { + } else if (wanted & BIT(SOCKLND_CONN_BULK_IN) && + conn_cb->ksnr_blki_conn_count <= conn_cb->ksnr_blko_conn_count) { type = SOCKLND_CONN_BULK_IN; } else { LASSERT(wanted & BIT(SOCKLND_CONN_BULK_OUT)); diff --git a/net/lnet/klnds/socklnd/socklnd_modparams.c b/net/lnet/klnds/socklnd/socklnd_modparams.c index 017627f..bc772e4 100644 --- a/net/lnet/klnds/socklnd/socklnd_modparams.c +++ b/net/lnet/klnds/socklnd/socklnd_modparams.c @@ -139,6 +139,10 @@ module_param(zc_recv_min_nfrags, int, 0644); MODULE_PARM_DESC(zc_recv_min_nfrags, "minimum # of fragments to enable ZC recv"); +static unsigned int conns_per_peer = 1; +module_param(conns_per_peer, uint, 0444); +MODULE_PARM_DESC(conns_per_peer, "number of connections per peer"); + #if SOCKNAL_VERSION_DEBUG static int protocol = 3; module_param(protocol, int, 0644); @@ -177,6 +181,11 @@ int ksocknal_tunables_init(void) ksocknal_tunables.ksnd_zc_min_payload = &zc_min_payload; ksocknal_tunables.ksnd_zc_recv = &zc_recv; ksocknal_tunables.ksnd_zc_recv_min_nfrags = &zc_recv_min_nfrags; + if (conns_per_peer > ((1 << SOCKNAL_CONN_COUNT_MAX_BITS) - 1)) { + CWARN("socklnd conns_per_peer is capped at %u.\n", + (1 << SOCKNAL_CONN_COUNT_MAX_BITS) - 1); + } + ksocknal_tunables.ksnd_conns_per_peer = &conns_per_peer; #if 
SOCKNAL_VERSION_DEBUG ksocknal_tunables.ksnd_protocol = &protocol; From patchwork Sat May 15 13:06:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259773 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3FCAAC433B4 for ; Sat, 15 May 2021 13:06:19 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D039961355 for ; Sat, 15 May 2021 13:06:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D039961355 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C873521E06B; Sat, 15 May 2021 06:06:16 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4110321CAD2 for ; Sat, 15 May 2021 06:06:15 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 87FFB1006774; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 7977A98158; Sat, 15 May 2021 09:06:12 -0400 (EDT) 
From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:02 -0400 Message-Id: <1621083970-32463-6-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 05/13] lustre: readahead: export pages directly without RA X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong With readahead disabled, @vpg_defer_uptodate should not be set, because no readahead credits are reserved for such a read. Otherwise vvp_page_completion_read() calls ll_ra_count_put() for pages that never reserved a credit, which makes @ra_cur_pages negative. 
Fixes: 3b1dfe4b4b ("lustre: llite: fix to submit complete read block with ra disabled") WC-bug-id: https://jira.whamcloud.com/browse/LU-14616 Lustre-commit: 9f1c0bfd10d619a3 ("LU-14616 readahead: export pages directly without RA") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/43338 Reviewed-by: Andreas Dilger Reviewed-by: Bobi Jam Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/rw.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c index 08ab25d..8dcbef3 100644 --- a/fs/lustre/llite/rw.c +++ b/fs/lustre/llite/rw.c @@ -237,8 +237,10 @@ static int ll_read_ahead_page(const struct lu_env *env, struct cl_io *io, cl_page_assume(env, io, page); vpg = cl2vvp_page(cl_object_page_slice(clob, page)); if (!vpg->vpg_defer_uptodate && !PageUptodate(vmpage)) { - vpg->vpg_defer_uptodate = 1; - vpg->vpg_ra_used = 0; + if (hint == MAYNEED) { + vpg->vpg_defer_uptodate = 1; + vpg->vpg_ra_used = 0; + } cl_page_list_add(queue, page); } else { /* skip completed pages */ From patchwork Sat May 15 13:06:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259777 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC30FC433ED for ; Sat, 15 May 2021 13:06:26 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested) by mail.kernel.org (Postfix) with ESMTPS id 3933361287 for ; Sat, 15 May 2021 13:06:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3933361287 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 6D7AF21F9E8; Sat, 15 May 2021 06:06:20 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 7913321CAD2 for ; Sat, 15 May 2021 06:06:15 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 8823F100677B; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 7B7139815B; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:03 -0400 Message-Id: <1621083970-32463-7-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 06/13] lustre: readahead: fix reserving for unaliged read X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong If a read covers [2K, 3K] on an x86 platform (4K pages), only one page needs to be read, but the count was calculated as 2 pages. 
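The [2K, 3K] case can be checked with a small user-space sketch of the two calculations in vvp_io_read_start(), before and after this patch. It assumes 4K pages and that cl_index() reduces to a right shift by the page shift; the SKETCH_* macros and the ra_pages_old()/ra_pages_new() names are made up for the illustration and are not kernel symbols.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4K pages, as on x86 */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_OFFS  (SKETCH_PAGE_SIZE - 1)

/* Page count as computed by the lines this patch removes. */
static unsigned long ra_pages_old(unsigned long long pos, size_t tot)
{
	unsigned long pages = (tot + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;

	/* One extra page whenever both start and end are unaligned. */
	if ((pos & SKETCH_PAGE_OFFS) != 0 &&
	    ((pos + tot) & SKETCH_PAGE_OFFS) != 0)
		pages++;
	return pages;
}

/* Page count as computed by the lines this patch adds. */
static unsigned long ra_pages_new(unsigned long long pos, size_t tot)
{
	unsigned long pages = 0;
	size_t page_offset = pos & SKETCH_PAGE_OFFS;

	if (page_offset) {
		/* The partial first page counts once... */
		pages++;
		/* ...and the bytes it covers are dropped from the total. */
		if (tot > SKETCH_PAGE_SIZE - page_offset)
			tot -= SKETCH_PAGE_SIZE - page_offset;
		else
			tot = 0;
	}
	/* The remaining bytes start page-aligned, so round up. */
	pages += (tot + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	return pages;
}
```

For pos = 2048 and tot = 1024 the old formula returns 2 and the new one returns 1. A read that really does straddle a page boundary, such as pos = 2048 and tot = 4096, is still counted as 2 pages by both.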
This is a problem because we then reserve more page credits than needed, while vvp_page_completion_read() only frees the pages actually read, which leaks @ra_cur_pages. Fixes: cc603a90cca ("lustre: llite: Fix page count for unaligned reads") WC-bug-id: https://jira.whamcloud.com/browse/LU-14616 Lustre-commit: 5e7e9240d27a4b74 ("LU-14616 readahead: fix reserving for unaliged read") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/43377 Reviewed-by: Andreas Dilger Reviewed-by: Bobi Jam Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/rw.c | 7 +++++++ fs/lustre/llite/vvp_io.c | 18 ++++++++++++------ 2 files changed, 19 insertions(+), 6 deletions(-) diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c index 8dcbef3..184e5e8 100644 --- a/fs/lustre/llite/rw.c +++ b/fs/lustre/llite/rw.c @@ -90,6 +90,13 @@ static unsigned long ll_ra_count_get(struct ll_sb_info *sbi, * LRU pages, otherwise, it could cause deadlock. */ pages = min(sbi->ll_cache->ccc_lru_max >> 2, pages); + /** + * if this happen, we reserve more pages than needed, + * this will make us leak @ra_cur_pages, because + * ll_ra_count_put() acutally freed @pages. + */ + if (WARN_ON_ONCE(pages_min > pages)) + pages_min = pages; /* * If read-ahead pages left are less than 1M, do not do read-ahead, diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c index e98792b..12a28d9 100644 --- a/fs/lustre/llite/vvp_io.c +++ b/fs/lustre/llite/vvp_io.c @@ -798,6 +798,7 @@ static int vvp_io_read_start(const struct lu_env *env, int exceed = 0; int result; struct iov_iter iter; + pgoff_t page_offset; CLOBINVRNT(env, obj, vvp_object_invariant(obj)); @@ -839,15 +840,20 @@ static int vvp_io_read_start(const struct lu_env *env, if (!vio->vui_ra_valid) { vio->vui_ra_valid = true; vio->vui_ra_start_idx = cl_index(obj, pos); - vio->vui_ra_pages = cl_index(obj, tot + PAGE_SIZE - 1); - /* If both start and end are unaligned, we read one more page - * than the index math suggests.
- */ - if ((pos & ~PAGE_MASK) != 0 && ((pos + tot) & ~PAGE_MASK) != 0) + vio->vui_ra_pages = 0; + page_offset = pos & ~PAGE_MASK; + if (page_offset) { vio->vui_ra_pages++; + if (tot > PAGE_SIZE - page_offset) + tot -= (PAGE_SIZE - page_offset); + else + tot = 0; + } + vio->vui_ra_pages += (tot + PAGE_SIZE - 1) >> PAGE_SHIFT; CDEBUG(D_READA, "tot %zu, ra_start %lu, ra_count %lu\n", - tot, vio->vui_ra_start_idx, vio->vui_ra_pages); + vio->vui_tot_count, vio->vui_ra_start_idx, + vio->vui_ra_pages); } /* BUG: 5972 */ From patchwork Sat May 15 13:06:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DD57C433B4 for ; Sat, 15 May 2021 13:06:33 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 14814611C9 for ; Sat, 15 May 2021 13:06:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 14814611C9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id ABF8821FABF; Sat, 15 May 2021 06:06:23 -0700 (PDT) Received: 
from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B24E321CAD2 for ; Sat, 15 May 2021 06:06:15 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 88D10100677C; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 7E96DA93C3; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:04 -0400 Message-Id: <1621083970-32463-8-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 07/13] lustre: sec: rework includes for client encryption X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Sebastien Buisson Simplify includes for crypto, by not repeating stubs in case CONFIG_FS_ENCRYPTION is not defined. Expose encoding routines that are going to be used in the Lustre code (both client and server sides) with filename encryption. 
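As an illustration of the encoding scheme this patch exposes in lustre_crypto.h (escape NUL, LF, CR, '/', DEL and '=' as '=' followed by the byte value plus 64), here is a minimal user-space sketch. It is not the kernel code: the sketch_*() names are hypothetical, it uses size_t and unsigned char instead of the kernel types, and the decoder adds a bounds check for a trailing '=' that is an assumption of this sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* True for the six bytes that get escaped: NUL, LF, CR, '/', DEL, '='. */
static inline int sketch_is_critical(unsigned char c)
{
	return c == 0x00 || c == 0x0A || c == 0x0D ||
	       c == '/' || c == 0x7F || c == '=';
}

/*
 * Encode len bytes from src into dst and return the number of bytes
 * written. In the worst case every byte is escaped, so dst must have
 * room for 2 * len bytes.
 */
static inline size_t sketch_encode(const unsigned char *src, size_t len,
				   unsigned char *dst)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++) {
		if (sketch_is_critical(src[i])) {
			dst[n++] = '=';
			dst[n++] = (unsigned char)(src[i] + 64);
		} else {
			dst[n++] = src[i];
		}
	}
	return n;
}

/*
 * Decode len bytes from src into dst and return the number of bytes
 * written. Stops early on a trailing '=' with no byte after it
 * (extra check added by this sketch).
 */
static inline size_t sketch_decode(const unsigned char *src, size_t len,
				   unsigned char *dst)
{
	size_t i = 0, n = 0;

	while (i < len) {
		if (src[i] == '=') {
			if (i + 1 >= len)
				break;
			dst[n++] = (unsigned char)(src[i + 1] - 64);
			i += 2;
		} else {
			dst[n++] = src[i++];
		}
	}
	return n;
}

/* Round trip: "a/b=" encodes to "a=ob=}" and decodes back. */
static inline void sketch_roundtrip_example(void)
{
	const unsigned char in[] = { 'a', '/', 'b', '=' };
	unsigned char enc[2 * sizeof(in)], dec[sizeof(in)];
	size_t elen = sketch_encode(in, sizeof(in), enc);
	size_t dlen;

	assert(elen == 6 && memcmp(enc, "a=ob=}", 6) == 0);
	dlen = sketch_decode(enc, elen, dec);
	assert(dlen == sizeof(in) && memcmp(dec, in, dlen) == 0);
}
```

None of the six escaped bytes maps onto another critical byte after adding 64, which is what keeps the encoded name free of NUL and '/' while leaving every other byte untouched.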
WC-bug-id: https://jira.whamcloud.com/browse/LU-13717 Lustre-commit: 028281ae195927e9751 ("LU-13717 sec: rework includes for client encryption") Signed-off-by: Sebastien Buisson Reviewed-on: https://review.whamcloud.com/43386 Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_crypto.h | 158 +++++++++++++++++++++----------------- fs/lustre/llite/crypto.c | 6 +- fs/lustre/llite/dir.c | 20 ++--- fs/lustre/llite/file.c | 22 +++--- fs/lustre/llite/llite_internal.h | 18 +---- fs/lustre/llite/llite_lib.c | 6 +- fs/lustre/llite/namei.c | 26 +++---- fs/lustre/llite/super25.c | 4 +- fs/lustre/osc/osc_request.c | 12 +-- 9 files changed, 135 insertions(+), 137 deletions(-) diff --git a/fs/lustre/include/lustre_crypto.h b/fs/lustre/include/lustre_crypto.h index 01b5e85..b19bb420 100644 --- a/fs/lustre/include/lustre_crypto.h +++ b/fs/lustre/include/lustre_crypto.h @@ -30,87 +30,101 @@ #ifndef _LUSTRE_CRYPTO_H_ #define _LUSTRE_CRYPTO_H_ +#include + struct ll_sb_info; +#ifdef CONFIG_FS_ENCRYPTION int ll_set_encflags(struct inode *inode, void *encctx, u32 encctxlen, bool preload); bool ll_sbi_has_test_dummy_encryption(struct ll_sb_info *sbi); bool ll_sbi_has_encrypt(struct ll_sb_info *sbi); void ll_sbi_set_encrypt(struct ll_sb_info *sbi, bool set); +#else +static inline int ll_set_encflags(struct inode *inode, void *encctx, + u32 encctxlen, bool preload) +{ + return 0; +} -#ifdef CONFIG_FS_ENCRYPTION -#define __FS_HAS_ENCRYPTION 1 -#include +static inline bool ll_sbi_has_test_dummy_encryption(struct ll_sb_info *sbi) +{ + return false; +} -#define llcrypt_operations fscrypt_operations -#define llcrypt_symlink_data fscrypt_symlink_data -#define llcrypt_dummy_context_enabled(inode) \ - fscrypt_dummy_context_enabled(inode) -#define llcrypt_has_encryption_key(inode) fscrypt_has_encryption_key(inode) -#define llcrypt_encrypt_pagecache_blocks(page, len, offs, gfp_flags) \ - fscrypt_encrypt_pagecache_blocks(page, len, offs, gfp_flags) -#define 
llcrypt_encrypt_block_inplace(inode, page, len, offs, lblk, gfp_flags) \ - fscrypt_encrypt_block_inplace(inode, page, len, offs, lblk, gfp_flags) -#define llcrypt_decrypt_pagecache_blocks(page, len, offs) \ - fscrypt_decrypt_pagecache_blocks(page, len, offs) -#define llcrypt_decrypt_block_inplace(inode, page, len, offs, lblk_num) \ - fscrypt_decrypt_block_inplace(inode, page, len, offs, lblk_num) -#define llcrypt_inherit_context(parent, child, fs_data, preload) \ - fscrypt_inherit_context(parent, child, fs_data, preload) -#define llcrypt_get_encryption_info(inode) fscrypt_get_encryption_info(inode) -#define llcrypt_put_encryption_info(inode) fscrypt_put_encryption_info(inode) -#define llcrypt_free_inode(inode) fscrypt_free_inode(inode) -#define llcrypt_finalize_bounce_page(pagep) fscrypt_finalize_bounce_page(pagep) -#define llcrypt_file_open(inode, filp) fscrypt_file_open(inode, filp) -#define llcrypt_ioctl_set_policy(filp, arg) fscrypt_ioctl_set_policy(filp, arg) -#define llcrypt_ioctl_get_policy_ex(filp, arg) \ - fscrypt_ioctl_get_policy_ex(filp, arg) -#define llcrypt_ioctl_add_key(filp, arg) fscrypt_ioctl_add_key(filp, arg) -#define llcrypt_ioctl_remove_key(filp, arg) fscrypt_ioctl_remove_key(filp, arg) -#define llcrypt_ioctl_remove_key_all_users(filp, arg) \ - fscrypt_ioctl_remove_key_all_users(filp, arg) -#define llcrypt_ioctl_get_key_status(filp, arg) \ - fscrypt_ioctl_get_key_status(filp, arg) -#define llcrypt_drop_inode(inode) fscrypt_drop_inode(inode) -#define llcrypt_prepare_rename(olddir, olddentry, newdir, newdentry, flags) \ - fscrypt_prepare_rename(olddir, olddentry, newdir, newdentry, flags) -#define llcrypt_prepare_link(old_dentry, dir, dentry) \ - fscrypt_prepare_link(old_dentry, dir, dentry) -#define llcrypt_prepare_setattr(dentry, attr) \ - fscrypt_prepare_setattr(dentry, attr) -#define llcrypt_set_ops(sb, cop) fscrypt_set_ops(sb, cop) -#else /* !CONFIG_FS_ENCRYPTION */ -#undef IS_ENCRYPTED -#define IS_ENCRYPTED(x) 0 -#define 
llcrypt_dummy_context_enabled(inode) NULL -/* copied from include/linux/fscrypt.h */ -#define llcrypt_has_encryption_key(inode) false -#define llcrypt_encrypt_pagecache_blocks(page, len, offs, gfp_flags) \ - ERR_PTR(-EOPNOTSUPP) -#define llcrypt_encrypt_block_inplace(inode, page, len, offs, lblk, gfp_flags) \ - -EOPNOTSUPP -#define llcrypt_decrypt_pagecache_blocks(page, len, offs) -EOPNOTSUPP -#define llcrypt_decrypt_block_inplace(inode, page, len, offs, lblk_num) \ - -EOPNOTSUPP -#define llcrypt_inherit_context(parent, child, fs_data, preload) -EOPNOTSUPP -#define llcrypt_get_encryption_info(inode) -EOPNOTSUPP -#define llcrypt_put_encryption_info(inode) do {} while (0) -#define llcrypt_free_inode(inode) do {} while (0) -#define llcrypt_finalize_bounce_page(pagep) do {} while (0) -static inline int llcrypt_file_open(struct inode *inode, struct file *filp) +static inline bool ll_sbi_has_encrypt(struct ll_sb_info *sbi) { - return IS_ENCRYPTED(inode) ? -EOPNOTSUPP : 0; + return false; } -#define llcrypt_ioctl_set_policy(filp, arg) -EOPNOTSUPP -#define llcrypt_ioctl_get_policy_ex(filp, arg) -EOPNOTSUPP -#define llcrypt_ioctl_add_key(filp, arg) -EOPNOTSUPP -#define llcrypt_ioctl_remove_key(filp, arg) -EOPNOTSUPP -#define llcrypt_ioctl_remove_key_all_users(filp, arg) -EOPNOTSUPP -#define llcrypt_ioctl_get_key_status(filp, arg) -EOPNOTSUPP -#define llcrypt_drop_inode(inode) 0 -#define llcrypt_prepare_rename(olddir, olddentry, newdir, newdentry, flags) 0 -#define llcrypt_prepare_link(old_dentry, dir, dentry) 0 -#define llcrypt_prepare_setattr(dentry, attr) 0 -#define llcrypt_set_ops(sb, cop) do {} while (0) -#endif /* CONFIG_FS_ENCRYPTION */ + +static inline void ll_sbi_set_encrypt(struct ll_sb_info *sbi, bool set) { } +#endif + +/* Encoding/decoding routines inspired from yEnc principles. + * We just take care of a few critical characters: + * NULL, LF, CR, /, DEL and =. + * If such a char is found, it is replaced with '=' followed by + * the char value + 64. 
+ * All other chars are left untouched. + * Efficiency of this encoding depends on the occurences of the + * critical chars, but statistically on binary data it can be much higher + * than base64 for instance. + */ +static inline int critical_encode(const u8 *src, int len, char *dst) +{ + u8 *p = (u8 *)src, *q = dst; + + while (p - src < len) { + /* escape NULL, LF, CR, /, DEL and = */ + if (unlikely(*p == 0x0 || *p == 0xA || *p == 0xD || + *p == '/' || *p == 0x7F || *p == '=')) { + *(q++) = '='; + *(q++) = *(p++) + 64; + } else { + *(q++) = *(p++); + } + } + + return (char *)q - dst; +} + +/* returns the number of chars encoding would produce */ +static inline int critical_chars(const u8 *src, int len) +{ + u8 *p = (u8 *)src; + int newlen = len; + + while (p - src < len) { + /* NULL, LF, CR, /, DEL and = cost an additional '=' */ + if (unlikely(*p == 0x0 || *p == 0xA || *p == 0xD || + *p == '/' || *p == 0x7F || *p == '=')) + newlen++; + p++; + } + + return newlen; +} + +/* decoding routine - returns the number of chars in output */ +static inline int critical_decode(const u8 *src, int len, char *dst) +{ + u8 *p = (u8 *)src, *q = dst; + + while (p - src < len) { + if (unlikely(*p == '=')) { + *(q++) = *(++p) - 64; + p++; + } else { + *(q++) = *(p++); + } + } + + return (char *)q - dst; +} + +/* Extracts the second-to-last ciphertext block */ +#define LLCRYPT_FNAME_DIGEST(name, len) \ + ((name) + round_down((len) - FS_CRYPTO_BLOCK_SIZE - 1, \ + FS_CRYPTO_BLOCK_SIZE)) +#define LLCRYPT_FNAME_DIGEST_SIZE FS_CRYPTO_BLOCK_SIZE #endif /* _LUSTRE_CRYPTO_H_ */ diff --git a/fs/lustre/llite/crypto.c b/fs/lustre/llite/crypto.c index 8bbb766..34d0ad1 100644 --- a/fs/lustre/llite/crypto.c +++ b/fs/lustre/llite/crypto.c @@ -64,7 +64,7 @@ int ll_set_encflags(struct inode *inode, void *encctx, u32 encctxlen, if (rc) return rc; - return preload ? llcrypt_get_encryption_info(inode) : 0; + return preload ? 
fscrypt_get_encryption_info(inode) : 0; } /* ll_set_context has 2 distinct behaviors, depending on the value of inode @@ -143,7 +143,7 @@ void ll_sbi_set_encrypt(struct ll_sb_info *sbi, bool set) static bool ll_empty_dir(struct inode *inode) { - /* used by llcrypt_ioctl_set_policy(), because a policy can only be set + /* used by fscrypt_ioctl_set_policy(), because a policy can only be set * on an empty dir. */ /* Here we choose to return true, meaning we always call .set_context. @@ -153,7 +153,7 @@ static bool ll_empty_dir(struct inode *inode) return true; } -const struct llcrypt_operations lustre_cryptops = { +const struct fscrypt_operations lustre_cryptops = { .key_prefix = "lustre:", .get_context = ll_get_context, .set_context = ll_set_context, diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 06ca329..13676c1 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -450,11 +450,11 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump, if (ll_sbi_has_encrypt(sbi) && (IS_ENCRYPTED(parent) || - unlikely(llcrypt_dummy_context_enabled(parent)))) { - err = llcrypt_get_encryption_info(parent); + unlikely(fscrypt_dummy_context_enabled(parent)))) { + err = fscrypt_get_encryption_info(parent); if (err) goto out_op_data; - if (!llcrypt_has_encryption_key(parent)) { + if (!fscrypt_has_encryption_key(parent)) { err = -ENOKEY; goto out_op_data; } @@ -476,7 +476,7 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump, } if (encrypt) { - err = llcrypt_inherit_context(parent, NULL, op_data, false); + err = fscrypt_inherit_context(parent, NULL, op_data, false); if (err) goto out_op_data; } @@ -2149,28 +2149,28 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) case FS_IOC_SET_ENCRYPTION_POLICY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_set_policy(file, (const void __user *)arg); + return fscrypt_ioctl_set_policy(file, 
(const void __user *)arg); case FS_IOC_GET_ENCRYPTION_POLICY_EX: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_get_policy_ex(file, (void __user *)arg); + return fscrypt_ioctl_get_policy_ex(file, (void __user *)arg); case FS_IOC_ADD_ENCRYPTION_KEY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_add_key(file, (void __user *)arg); + return fscrypt_ioctl_add_key(file, (void __user *)arg); case FS_IOC_REMOVE_ENCRYPTION_KEY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_remove_key(file, (void __user *)arg); + return fscrypt_ioctl_remove_key(file, (void __user *)arg); case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_remove_key_all_users(file, + return fscrypt_ioctl_remove_key_all_users(file, (void __user *)arg); case FS_IOC_GET_ENCRYPTION_KEY_STATUS: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_get_key_status(file, (void __user *)arg); + return fscrypt_ioctl_get_key_status(file, (void __user *)arg); #endif default: return obd_iocontrol(cmd, sbi->ll_dt_exp, 0, NULL, diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 78f3469..ffddec6 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -443,7 +443,7 @@ static inline int ll_dom_readpage(void *data, struct page *page) kunmap_atomic(kaddr); if (inode && IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode)) { - if (!llcrypt_has_encryption_key(inode)) { + if (!fscrypt_has_encryption_key(inode)) { CDEBUG(D_SEC, "no enc key for " DFID "\n", PFID(ll_inode2fid(inode))); } else { @@ -456,7 +456,7 @@ static inline int ll_dom_readpage(void *data, struct page *page) LUSTRE_ENCRYPTION_UNIT_SIZE) == 0) break; - rc = llcrypt_decrypt_pagecache_blocks(page, + rc = fscrypt_decrypt_pagecache_blocks(page, LUSTRE_ENCRYPTION_UNIT_SIZE, 0); if (rc) @@ -776,7 +776,7 @@ int ll_file_open(struct 
inode *inode, struct file *file) file->private_data = NULL; /* prevent ll_local_open assertion */ if (S_ISREG(inode->i_mode)) { - rc = llcrypt_file_open(inode, file); + rc = fscrypt_file_open(inode, file); if (rc) goto out_nofiledata; } @@ -4063,28 +4063,28 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) case FS_IOC_SET_ENCRYPTION_POLICY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_set_policy(file, (const void __user *)arg); + return fscrypt_ioctl_set_policy(file, (const void __user *)arg); case FS_IOC_GET_ENCRYPTION_POLICY_EX: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_get_policy_ex(file, (void __user *)arg); + return fscrypt_ioctl_get_policy_ex(file, (void __user *)arg); case FS_IOC_ADD_ENCRYPTION_KEY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_add_key(file, (void __user *)arg); + return fscrypt_ioctl_add_key(file, (void __user *)arg); case FS_IOC_REMOVE_ENCRYPTION_KEY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_remove_key(file, (void __user *)arg); + return fscrypt_ioctl_remove_key(file, (void __user *)arg); case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_remove_key_all_users(file, + return fscrypt_ioctl_remove_key_all_users(file, (void __user *)arg); case FS_IOC_GET_ENCRYPTION_KEY_STATUS: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return llcrypt_ioctl_get_key_status(file, (void __user *)arg); + return fscrypt_ioctl_get_key_status(file, (void __user *)arg); #endif case LL_IOC_UNLOCK_FOREIGN: { @@ -4551,10 +4551,10 @@ int ll_migrate(struct inode *parent, struct file *file, struct lmv_user_md *lum, } if (IS_ENCRYPTED(child_inode)) { - rc = llcrypt_get_encryption_info(child_inode); + rc = fscrypt_get_encryption_info(child_inode); if (rc) goto out_iput; - if 
(!llcrypt_has_encryption_key(child_inode)) { + if (!fscrypt_has_encryption_key(child_inode)) { CDEBUG(D_SEC, "no enc key for "DFID"\n", PFID(ll_inode2fid(child_inode))); rc = -ENOKEY; diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index b3e8a96..03d2796 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -1681,25 +1681,9 @@ static inline struct pcc_super *ll_info2pccs(struct ll_inode_info *lli) return ll_i2pccs(ll_info2i(lli)); } -#ifdef CONFIG_FS_ENCRYPTION /* crypto.c */ -extern const struct llcrypt_operations lustre_cryptops; - -#else /* !CONFIG_FS_ENCRYPTION */ -inline bool ll_sbi_has_test_dummy_encryption(struct ll_sb_info *sbi) -{ - return false; -} +extern const struct fscrypt_operations lustre_cryptops; -inline bool ll_sbi_has_encrypt(struct ll_sb_info *sbi) -{ - return false; -} - -inline void ll_sbi_set_encrypt(struct ll_sb_info *sbi, bool set) -{ -} -#endif /* !CONFIG_FS_ENCRYPTION */ /* llite/llite_foreign.c */ int ll_manage_foreign(struct inode *inode, struct lustre_md *lmd); bool ll_foreign_is_openable(struct dentry *dentry, unsigned int flags); diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 1b3eef0..ada2b625c 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -616,8 +616,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt) #if THREAD_SIZE >= 8192 /*b=17630*/ sb->s_export_op = &lustre_export_operations; #endif - llcrypt_set_ops(sb, &lustre_cryptops); + fscrypt_set_ops(sb, &lustre_cryptops); /* make root inode * XXX: move this to after cbd setup? 
*/ @@ -1682,7 +1682,7 @@ void ll_clear_inode(struct inode *inode) */ cl_inode_fini(inode); - llcrypt_put_encryption_info(inode); + fscrypt_put_encryption_info(inode); } static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data) @@ -2140,7 +2140,7 @@ int ll_setattr(struct dentry *de, struct iattr *attr) enum op_xvalid xvalid = 0; int rc; - rc = llcrypt_prepare_setattr(de, attr); + rc = fscrypt_prepare_setattr(de, attr); if (rc) return rc; diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c index 2da33d0..658da49 100644 --- a/fs/lustre/llite/namei.c +++ b/fs/lustre/llite/namei.c @@ -674,7 +674,7 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request, ll_i2sbi(inode)->ll_fsname, PFID(ll_inode2fid(inode)), rc); } else if (encrypt) { - rc = llcrypt_get_encryption_info(inode); + rc = fscrypt_get_encryption_info(inode); if (rc) CDEBUG(D_SEC, "cannot get enc info for " DFID ": rc = %d\n", @@ -744,10 +744,10 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request, d_lustre_revalidate(*de); if (encrypt) { - rc = llcrypt_get_encryption_info(inode); + rc = fscrypt_get_encryption_info(inode); if (rc) goto out; - if (!llcrypt_has_encryption_key(inode)) { + if (!fscrypt_has_encryption_key(inode)) { rc = -ENOKEY; goto out; } @@ -878,7 +878,7 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry, *secctxlen = 0; } if (it->it_op & IT_CREAT && encrypt) { - rc = llcrypt_inherit_context(parent, NULL, op_data, false); + rc = fscrypt_inherit_context(parent, NULL, op_data, false); if (rc) { retval = ERR_PTR(rc); goto out; @@ -1134,11 +1134,11 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry, /* in case of create, this is going to be a regular file because * we set S_IFREG bit on it->it_create_mode above */ - rc = llcrypt_get_encryption_info(dir); + rc = fscrypt_get_encryption_info(dir); if (rc) goto out_release; if (open_flags & O_CREAT) { - if (!llcrypt_has_encryption_key(dir)) { + if 
(!fscrypt_has_encryption_key(dir)) { rc = -ENOKEY; goto out_release; } @@ -1390,11 +1390,11 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry, if (ll_sbi_has_encrypt(sbi) && ((IS_ENCRYPTED(dir) && (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode))) || - (unlikely(llcrypt_dummy_context_enabled(dir)) && S_ISDIR(mode)))) { - err = llcrypt_get_encryption_info(dir); + (unlikely(fscrypt_dummy_context_enabled(dir)) && S_ISDIR(mode)))) { + err = fscrypt_get_encryption_info(dir); if (err) goto err_exit; - if (!llcrypt_has_encryption_key(dir)) { + if (!fscrypt_has_encryption_key(dir)) { err = -ENOKEY; goto err_exit; } @@ -1402,7 +1402,7 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry, } if (encrypt) { - err = llcrypt_inherit_context(dir, NULL, op_data, false); + err = fscrypt_inherit_context(dir, NULL, op_data, false); if (err) goto err_exit; } @@ -1504,7 +1504,7 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry, d_instantiate(dentry, inode); if (encrypt) { - err = llcrypt_inherit_context(dir, inode, NULL, true); + err = fscrypt_inherit_context(dir, inode, NULL, true); if (err) goto err_exit; } @@ -1740,7 +1740,7 @@ static int ll_link(struct dentry *old_dentry, struct inode *dir, PFID(ll_inode2fid(src)), src, PFID(ll_inode2fid(dir)), dir, new_dentry); - err = llcrypt_prepare_link(old_dentry, dir, new_dentry); + err = fscrypt_prepare_link(old_dentry, dir, new_dentry); if (err) return err; @@ -1785,7 +1785,7 @@ static int ll_rename(struct inode *src, struct dentry *src_dchild, if (unlikely(d_mountpoint(src_dchild) || d_mountpoint(tgt_dchild))) return -EBUSY; - err = llcrypt_prepare_rename(src, src_dchild, tgt, tgt_dchild, flags); + err = fscrypt_prepare_rename(src, src_dchild, tgt, tgt_dchild, flags); if (err) return err; diff --git a/fs/lustre/llite/super25.c b/fs/lustre/llite/super25.c index decfa2f..f50c23a 100644 --- a/fs/lustre/llite/super25.c +++ b/fs/lustre/llite/super25.c @@ -63,7 +63,7 @@ static void 
ll_inode_destroy_callback(struct rcu_head *head) struct inode *inode = container_of(head, struct inode, i_rcu); struct ll_inode_info *ptr = ll_i2info(inode); - llcrypt_free_inode(inode); + fscrypt_free_inode(inode); kmem_cache_free(ll_inode_cachep, ptr); } @@ -77,7 +77,7 @@ static int ll_drop_inode(struct inode *inode) int drop = generic_drop_inode(inode); if (!drop) - drop = llcrypt_drop_inode(inode); + drop = fscrypt_drop_inode(inode); return drop; } diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c index e49d73f..0d590ed 100644 --- a/fs/lustre/osc/osc_request.c +++ b/fs/lustre/osc/osc_request.c @@ -1371,11 +1371,11 @@ static inline void osc_release_bounce_pages(struct brw_page **pga, for (i = 0; i < page_count; i++) { /* Bounce pages allocated by a call to - * llcrypt_encrypt_pagecache_blocks() in osc_brw_prep_request() + * fscrypt_encrypt_pagecache_blocks() in osc_brw_prep_request() * are identified thanks to the PageChecked flag. */ if (PageChecked(pga[i]->pg)) - llcrypt_finalize_bounce_page(&pga[i]->pg); + fscrypt_finalize_bounce_page(&pga[i]->pg); pga[i]->count -= pga[i]->bp_count_diff; pga[i]->off += pga[i]->bp_off_diff; } @@ -1463,7 +1463,7 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli, pg->pg->index = pg->off >> PAGE_SHIFT; } data_page = - llcrypt_encrypt_pagecache_blocks(pg->pg, + fscrypt_encrypt_pagecache_blocks(pg->pg, nunits, 0, GFP_NOFS); if (directio) { @@ -2145,7 +2145,7 @@ static int osc_brw_fini_request(struct ptlrpc_request *req, int rc) if (inode && IS_ENCRYPTED(inode)) { int idx; - if (!llcrypt_has_encryption_key(inode)) { + if (!fscrypt_has_encryption_key(inode)) { CDEBUG(D_SEC, "no enc key for ino %lu\n", inode->i_ino); goto out; } @@ -2181,7 +2181,7 @@ static int osc_brw_fini_request(struct ptlrpc_request *req, int rc) for (i = offs; i < offs + LUSTRE_ENCRYPTION_UNIT_SIZE; i += blocksize, lblk_num++) { - rc = llcrypt_decrypt_block_inplace(inode, + rc = fscrypt_decrypt_block_inplace(inode, pg->pg, 
blocksize, i, lblk_num); @@ -2189,7 +2189,7 @@ static int osc_brw_fini_request(struct ptlrpc_request *req, int rc) break; } } else { - rc = llcrypt_decrypt_pagecache_blocks(pg->pg, + rc = fscrypt_decrypt_pagecache_blocks(pg->pg, LUSTRE_ENCRYPTION_UNIT_SIZE, offs); } From patchwork Sat May 15 13:06:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259775 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B629DC433B4 for ; Sat, 15 May 2021 13:06:25 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4B92A61287 for ; Sat, 15 May 2021 13:06:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4B92A61287 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id F02F021EBCC; Sat, 15 May 2021 06:06:19 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0853E21CAD2 for ; Sat, 15 May 2021 06:06:16 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by 
smtp4.ccs.ornl.gov (Postfix) with ESMTP id 8C74F1006C90; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 824BA64D42; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:05 -0400 Message-Id: <1621083970-32463-9-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 08/13] lustre: ptlrpc: remove might_sleep() in sptlrpc_gc_del_sec() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Nikitas Angelinas , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Nikitas Angelinas sptlrpc_gc_del_sec() calls mutex_lock() which calls might_sleep(), so the explicit might_sleep() call can be removed as redundant. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-14628 Lustre-commit: c31fb42f9aa561ae ("LU-14628 ptlrpc: remove might_sleep() in sptlrpc_gc_del_sec()") Signed-off-by: Nikitas Angelinas Reviewed-on: https://review.whamcloud.com/43397 Reviewed-by: Andreas Dilger Reviewed-by: Shaun Tancheff Reviewed-by: James Simmons Reviewed-by: Sebastien Buisson Signed-off-by: James Simmons --- fs/lustre/ptlrpc/sec_gc.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/lustre/ptlrpc/sec_gc.c b/fs/lustre/ptlrpc/sec_gc.c index bc76323..fedcf2c 100644 --- a/fs/lustre/ptlrpc/sec_gc.c +++ b/fs/lustre/ptlrpc/sec_gc.c @@ -76,8 +76,6 @@ void sptlrpc_gc_del_sec(struct ptlrpc_sec *sec) if (list_empty(&sec->ps_gc_list)) return; - might_sleep(); - /* signal before list_del to make iteration in gc thread safe */ atomic_inc(&sec_gc_wait_del); From patchwork Sat May 15 13:06:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259793 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB403C433ED for ; Sat, 15 May 2021 13:06:50 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8E30461355 for ; Sat, 15 May 2021 13:06:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8E30461355 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4894F21FB63; Sat, 15 May 2021 06:06:33 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 899BD21E047 for ; Sat, 15 May 2021 06:06:16 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 8E1271006CA0; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 85D5993BCC; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:06 -0400 Message-Id: <1621083970-32463-10-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 09/13] lustre; obdclass: server qos penalty miscaculated X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lai Siyao , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Lai Siyao The server QoS penalty calculation uses the active target count, but it should use the active server count. Using the target count makes the penalty larger than expected, so the weights of targets are often 0, and as a result MDT0 is often chosen in QoS allocation.
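The effect of the wrong multiplier described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the Lustre implementation: the helper names, the "available space minus penalty, floored at 0" weight rule, and all numbers are hypothetical.

```c
#include <stdint.h>

/* Hypothetical, simplified model: the penalty a server carries after it
 * receives a new object. "count" is the multiplier under discussion
 * (active target count before the fix, active server count after it).
 */
static uint64_t svr_penalty_after_alloc(uint64_t penalty,
					uint64_t penalty_per_obj,
					uint32_t count)
{
	return penalty + penalty_per_obj * count;
}

/* Assumed weight rule for this sketch: available space minus penalty,
 * floored at 0.
 */
static uint64_t tgt_weight(uint64_t avail, uint64_t penalty)
{
	return avail > penalty ? avail - penalty : 0;
}
```

With, say, 8 active targets spread over 2 servers, a per-object penalty of 100 and 500 units available, multiplying by the target count pushes the weight to 0, while multiplying by the server count leaves a weight of 300.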
Fixes: 3f2a3e1d4 ("lustre: obdclass: lu_tgt_descs cleanup") WC-bug-id: https://jira.whamcloud.com/browse/LU-13440 Lustre-commit: 0ccce7ecb72f847f ("LU-13440 obdclass: server qos penalty miscaculated") Signed-off-by: Lai Siyao Reviewed-on: https://review.whamcloud.com/43385 Reviewed-by: Andreas Dilger Reviewed-by: Hongchao Zhang Signed-off-by: James Simmons --- fs/lustre/obdclass/lu_tgt_descs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/lustre/obdclass/lu_tgt_descs.c b/fs/lustre/obdclass/lu_tgt_descs.c index cb62ce4..9f33d22 100644 --- a/fs/lustre/obdclass/lu_tgt_descs.c +++ b/fs/lustre/obdclass/lu_tgt_descs.c @@ -633,7 +633,7 @@ int ltd_qos_update(struct lu_tgt_descs *ltd, struct lu_tgt_desc *tgt, ltq->ltq_penalty += ltq->ltq_penalty_per_obj * ltd->ltd_lov_desc.ld_active_tgt_count; svr->lsq_penalty += svr->lsq_penalty_per_obj * - ltd->ltd_lov_desc.ld_active_tgt_count; + qos->lq_active_svr_count; /* Decrease all MDS penalties */ list_for_each_entry(svr, &qos->lq_svr_list, lsq_svr_list) { From patchwork Sat May 15 13:06:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259797 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7941C433B4 for ; Sat, 15 May 2021 13:06:57 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 
8421A611C9 for ; Sat, 15 May 2021 13:06:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8421A611C9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 355FE21FA9B; Sat, 15 May 2021 06:06:38 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4025C21CB06 for ; Sat, 15 May 2021 06:06:16 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 8E9EC1006CBC; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8989993BCE; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:07 -0400 Message-Id: <1621083970-32463-11-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 10/13] lustre: lmv: add default LMV inherit depth X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lai Siyao , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Lai Siyao A new field "u8 lum_max_inherit" is added into struct lmv_user_md, which represents the inherit depth of default LMV. It will be decreased by 1 for subdirectories. The valid value of lum_max_inherit is [0, 255]: * 0 means unlimited inherit. 
* 1 means inherit end. * 250 is the max inherit depth. * [251, 254] are reserved. * 255 means it's not set. A new field "u8 lum_max_inherit_rr" is added. If the default stripe offset is -1, lum_max_inherit_rr is non-zero, and the system is balanced, new directories are created in a round-robin manner; otherwise they are created on the MDT where their parents are located, to avoid creating remote directories. Similarly, this value is decreased by 1 for each level of subdirectories. The valid values of lum_max_inherit_rr differ: * 0 means not set. * 1 means inherit end. * 250 is the max inherit depth. * [251, 254] are reserved. * 255 means unlimited inherit. However for the user interface of "lfs", the valid value is [-1, 250]: * -1 means unlimited inherit. * 0 means not set. * others are the same. WC-bug-id: https://jira.whamcloud.com/browse/LU-13440 Lustre-commit: 01d34a6b3b2e34f7 ("LU-13440 lmv: add default LMV inherit depth") Signed-off-by: Lai Siyao Reviewed-on: https://review.whamcloud.com/43131 Reviewed-by: Andreas Dilger Reviewed-by: Hongchao Zhang Signed-off-by: James Simmons --- fs/lustre/include/lu_object.h | 24 ++++++++++++- fs/lustre/include/lustre_lmv.h | 8 +++-- fs/lustre/llite/namei.c | 4 +++ fs/lustre/lmv/lmv_obd.c | 62 +++++++++++++++++++++++++++------ fs/lustre/obdclass/lu_tgt_descs.c | 16 --------- fs/lustre/ptlrpc/pack_generic.c | 5 ++- include/uapi/linux/lustre/lustre_user.h | 37 +++++++++++++++++++- 7 files changed, 124 insertions(+), 32 deletions(-) diff --git a/fs/lustre/include/lu_object.h b/fs/lustre/include/lu_object.h index a270631..3a71d6b 100644 --- a/fs/lustre/include/lu_object.h +++ b/fs/lustre/include/lu_object.h @@ -1537,11 +1537,33 @@ struct lu_tgt_descs { void lu_tgt_descs_fini(struct lu_tgt_descs *ltd); int ltd_add_tgt(struct lu_tgt_descs *ltd, struct lu_tgt_desc *tgt); void ltd_del_tgt(struct lu_tgt_descs *ltd, struct lu_tgt_desc *tgt); -bool ltd_qos_is_usable(struct lu_tgt_descs *ltd); int ltd_qos_penalties_calc(struct
lu_tgt_descs *ltd); int ltd_qos_update(struct lu_tgt_descs *ltd, struct lu_tgt_desc *tgt, u64 *total_wt); +/** + * Whether MDT inode and space usages are balanced. + */ +static inline bool ltd_qos_is_balanced(struct lu_tgt_descs *ltd) +{ + return !test_bit(LQ_DIRTY, &ltd->ltd_qos.lq_flags) && + test_bit(LQ_SAME_SPACE, &ltd->ltd_qos.lq_flags); +} + +/** + * Whether QoS data is up-to-date and QoS can be applied. + */ +static inline bool ltd_qos_is_usable(struct lu_tgt_descs *ltd) +{ + if (ltd_qos_is_balanced(ltd)) + return false; + + if (ltd->ltd_lov_desc.ld_active_tgt_count < 2) + return false; + + return true; +} + static inline struct lu_tgt_desc *ltd_first_tgt(struct lu_tgt_descs *ltd) { int index; diff --git a/fs/lustre/include/lustre_lmv.h b/fs/lustre/include/lustre_lmv.h index aee8342..a74f0a5 100644 --- a/fs/lustre/include/lustre_lmv.h +++ b/fs/lustre/include/lustre_lmv.h @@ -46,6 +46,8 @@ struct lmv_stripe_md { u32 lsm_md_stripe_count; u32 lsm_md_master_mdt_index; u32 lsm_md_hash_type; + u8 lsm_md_max_inherit; + u8 lsm_md_max_inherit_rr; u32 lsm_md_layout_version; u32 lsm_md_migrate_offset; u32 lsm_md_migrate_hash; @@ -119,11 +121,11 @@ static inline void lsm_md_dump(int mask, const struct lmv_stripe_md *lsm) * terminated string so only print LOV_MAXPOOLNAME bytes.
*/ CDEBUG(mask, - "magic %#x stripe count %d master mdt %d hash type %#x version %d migrate offset %d migrate hash %#x pool %.*s\n", + "magic %#x stripe count %d master mdt %d hash type %#x max inherit %hhu version %d migrate offset %d migrate hash %#x pool %.*s\n", lsm->lsm_md_magic, lsm->lsm_md_stripe_count, lsm->lsm_md_master_mdt_index, lsm->lsm_md_hash_type, - lsm->lsm_md_layout_version, lsm->lsm_md_migrate_offset, - lsm->lsm_md_migrate_hash, + lsm->lsm_md_max_inherit, lsm->lsm_md_layout_version, + lsm->lsm_md_migrate_offset, lsm->lsm_md_migrate_hash, LOV_MAXPOOLNAME, lsm->lsm_md_pool_name); if (!lmv_dir_striped(lsm)) diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c index 658da49..6ed2943 100644 --- a/fs/lustre/llite/namei.c +++ b/fs/lustre/llite/namei.c @@ -1451,6 +1451,10 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry, md.default_lmv->lsm_md_master_mdt_index = lum->lum_stripe_offset; md.default_lmv->lsm_md_hash_type = lum->lum_hash_type; + md.default_lmv->lsm_md_max_inherit = + lum->lum_max_inherit; + md.default_lmv->lsm_md_max_inherit_rr = + lum->lum_max_inherit_rr; err = ll_update_inode(dir, &md); md_free_lustre_md(sbi->ll_md_exp, &md); diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c index 4fa441e..552ef07 100644 --- a/fs/lustre/lmv/lmv_obd.c +++ b/fs/lustre/lmv/lmv_obd.c @@ -1695,6 +1695,22 @@ int lmv_old_layout_lookup(struct lmv_obd *lmv, struct md_op_data *op_data) return rc; } +static inline bool lmv_op_user_qos_mkdir(const struct md_op_data *op_data) +{ + const struct lmv_user_md *lum = op_data->op_data; + + return (op_data->op_cli_flags & CLI_SET_MEA) && lum && + le32_to_cpu(lum->lum_magic) == LMV_USER_MAGIC && + le32_to_cpu(lum->lum_stripe_offset) == LMV_OFFSET_DEFAULT; +} + +static inline bool lmv_op_default_qos_mkdir(const struct md_op_data *op_data) +{ + const struct lmv_stripe_md *lsm = op_data->op_default_mea1; + + return lsm && lsm->lsm_md_master_mdt_index == LMV_OFFSET_DEFAULT; +} + /* mkdir by 
QoS in two cases: * 1. 'lfs mkdir -i -1' * 2. parent default LMV master_mdt_index is -1 @@ -1704,27 +1720,38 @@ int lmv_old_layout_lookup(struct lmv_obd *lmv, struct md_op_data *op_data) */ static inline bool lmv_op_qos_mkdir(const struct md_op_data *op_data) { - const struct lmv_stripe_md *lsm = op_data->op_default_mea1; - const struct lmv_user_md *lum = op_data->op_data; - if (op_data->op_code != LUSTRE_OPC_MKDIR) return false; if (lmv_dir_striped(op_data->op_mea1)) return false; - if (op_data->op_cli_flags & CLI_SET_MEA && lum && - (le32_to_cpu(lum->lum_magic) == LMV_USER_MAGIC || - le32_to_cpu(lum->lum_magic) == LMV_USER_MAGIC_SPECIFIC) && - le32_to_cpu(lum->lum_stripe_offset) == LMV_OFFSET_DEFAULT) + if (lmv_op_user_qos_mkdir(op_data)) return true; - if (lsm && lsm->lsm_md_master_mdt_index == LMV_OFFSET_DEFAULT) + if (lmv_op_default_qos_mkdir(op_data)) return true; return false; } +/* if default LMV is set, and its index is LMV_OFFSET_DEFAULT, and + * 1. max_inherit_rr is set and is not LMV_INHERIT_RR_NONE + * 2. or parent is ROOT + * mkdir roundrobin. + * NB, this also needs to check server is balanced, which is checked by caller. 
+ */ +static inline bool lmv_op_default_rr_mkdir(const struct md_op_data *op_data) +{ + const struct lmv_stripe_md *lsm = op_data->op_default_mea1; + + if (!lmv_op_default_qos_mkdir(op_data)) + return false; + + return lsm->lsm_md_max_inherit_rr != LMV_INHERIT_RR_NONE || + fid_is_root(&op_data->op_fid1); +} + /* 'lfs mkdir -i ' */ static inline bool lmv_op_user_specific_mkdir(const struct md_op_data *op_data) { @@ -1746,6 +1773,7 @@ static inline bool lmv_op_user_specific_mkdir(const struct md_op_data *op_data) op_data->op_default_mea1->lsm_md_master_mdt_index != LMV_OFFSET_DEFAULT; } + int lmv_create(struct obd_export *exp, struct md_op_data *op_data, const void *data, size_t datalen, umode_t mode, uid_t uid, gid_t gid, kernel_cap_t cap_effective, u64 rdev, @@ -1793,11 +1821,23 @@ int lmv_create(struct obd_export *exp, struct md_op_data *op_data, if (!tgt) return -ENODEV; } else if (lmv_op_qos_mkdir(op_data)) { + struct lmv_tgt_desc *tmp = tgt; + tgt = lmv_locate_tgt_qos(lmv, &op_data->op_mds); - if (tgt == ERR_PTR(-EAGAIN)) - tgt = lmv_locate_tgt_rr(lmv, &op_data->op_mds); + if (tgt == ERR_PTR(-EAGAIN)) { + if (ltd_qos_is_balanced(&lmv->lmv_mdt_descs) && + !lmv_op_default_rr_mkdir(op_data) && + !lmv_op_user_qos_mkdir(op_data)) + /* if it's not necessary, don't create remote + * directory. 
+ */ + tgt = tmp; + else + tgt = lmv_locate_tgt_rr(lmv, &op_data->op_mds); + } if (IS_ERR(tgt)) return PTR_ERR(tgt); + /* * only update statfs after QoS mkdir, this means the cached * statfs may be stale, and current mkdir may not follow QoS @@ -3110,6 +3150,8 @@ static inline int lmv_unpack_user_md(struct obd_export *exp, lsm->lsm_md_stripe_count = le32_to_cpu(lmu->lum_stripe_count); lsm->lsm_md_master_mdt_index = le32_to_cpu(lmu->lum_stripe_offset); lsm->lsm_md_hash_type = le32_to_cpu(lmu->lum_hash_type); + lsm->lsm_md_max_inherit = lmu->lum_max_inherit; + lsm->lsm_md_max_inherit_rr = lmu->lum_max_inherit_rr; lsm->lsm_md_pool_name[LOV_MAXPOOLNAME] = 0; return 0; diff --git a/fs/lustre/obdclass/lu_tgt_descs.c b/fs/lustre/obdclass/lu_tgt_descs.c index 9f33d22..83f4675 100644 --- a/fs/lustre/obdclass/lu_tgt_descs.c +++ b/fs/lustre/obdclass/lu_tgt_descs.c @@ -403,22 +403,6 @@ void ltd_del_tgt(struct lu_tgt_descs *ltd, struct lu_tgt_desc *tgt) EXPORT_SYMBOL(ltd_del_tgt); /** - * Whether QoS data is up-to-date and QoS can be applied. 
- */ -bool ltd_qos_is_usable(struct lu_tgt_descs *ltd) -{ - if (!test_bit(LQ_DIRTY, &ltd->ltd_qos.lq_flags) && - test_bit(LQ_SAME_SPACE, &ltd->ltd_qos.lq_flags)) - return false; - - if (ltd->ltd_lov_desc.ld_active_tgt_count < 2) - return false; - - return true; -} -EXPORT_SYMBOL(ltd_qos_is_usable); - -/** * Calculate penalties per-tgt and per-server * * Re-calculate penalties when the configuration changes, active targets diff --git a/fs/lustre/ptlrpc/pack_generic.c b/fs/lustre/ptlrpc/pack_generic.c index 5dbab3d..047573a 100644 --- a/fs/lustre/ptlrpc/pack_generic.c +++ b/fs/lustre/ptlrpc/pack_generic.c @@ -2067,7 +2067,10 @@ void lustre_swab_lmv_user_md(struct lmv_user_md *lum) __swab32s(&lum->lum_stripe_offset); __swab32s(&lum->lum_hash_type); __swab32s(&lum->lum_type); - BUILD_BUG_ON(!offsetof(typeof(*lum), lum_padding1)); + /* lum_max_inherit and lum_max_inherit_rr do not need to be swabbed */ + BUILD_BUG_ON(offsetof(typeof(*lum), lum_padding1) == 0); + BUILD_BUG_ON(offsetof(typeof(*lum), lum_padding2) == 0); + BUILD_BUG_ON(offsetof(typeof(*lum), lum_padding3) == 0); switch (lum->lum_magic) { case LMV_USER_MAGIC_SPECIFIC: count = lum->lum_stripe_count; diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h index 542d2d3..bcb9f86 100644 --- a/include/uapi/linux/lustre/lustre_user.h +++ b/include/uapi/linux/lustre/lustre_user.h @@ -789,7 +789,11 @@ struct lmv_user_md_v1 { __u32 lum_stripe_offset; /* MDT idx for default dirstripe */ __u32 lum_hash_type; /* Dir stripe policy */ __u32 lum_type; /* LMV type: default */ - __u32 lum_padding1; + __u8 lum_max_inherit; /* inherit depth of default LMV */ + __u8 lum_max_inherit_rr; /* inherit depth of default LMV to + * round-robin mkdir + */ + __u16 lum_padding1; __u32 lum_padding2; __u32 lum_padding3; char lum_pool_name[LOV_MAXPOOLNAME + 1]; @@ -815,6 +819,37 @@ enum lmv_type { LMV_TYPE_DEFAULT = 0x0000, }; +/* lum_max_inherit will be decreased by 1 after each inheritance if it's not +
* LMV_INHERIT_UNLIMITED or > LMV_INHERIT_MAX. + */ +enum { + /* for historical reason, 0 means unlimited inheritance */ + LMV_INHERIT_UNLIMITED = 0, + /* unlimited lum_max_inherit by default */ + LMV_INHERIT_DEFAULT = 0, + /* not inherit any more */ + LMV_INHERIT_END = 1, + /* max inherit depth */ + LMV_INHERIT_MAX = 250, + /* [251, 254] are reserved */ + /* not set, or when inherit depth goes beyond end, */ + LMV_INHERIT_NONE = 255, +}; + +enum { + /* not set, or when inherit_rr depth goes beyond end, */ + LMV_INHERIT_RR_NONE = 0, + /* disable lum_max_inherit_rr by default */ + LMV_INHERIT_RR_DEFAULT = 0, + /* not inherit any more */ + LMV_INHERIT_RR_END = 1, + /* max inherit depth */ + LMV_INHERIT_RR_MAX = 250, + /* [251, 254] are reserved */ + /* unlimited inheritance */ + LMV_INHERIT_RR_UNLIMITED = 255, +}; + static inline int lmv_user_md_size(int stripes, int lmm_magic) { int size = sizeof(struct lmv_user_md); From patchwork Sat May 15 13:06:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259785 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1750C433B4 for ; Sat, 15 May 2021 13:06:39 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 88311611C9 for ; Sat, 15 May 2021 13:06:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 
mail.kernel.org 88311611C9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0F8B021FAFA; Sat, 15 May 2021 06:06:27 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C5A5F21E068 for ; Sat, 15 May 2021 06:06:16 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 91A3D1006EA0; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8D54498124; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:08 -0400 Message-Id: <1621083970-32463-12-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 11/13] lustre: lmv: qos stay on current MDT if less full X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lai Siyao , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Keep "space balanced" subdirectories on the parent MDT if it is less full than average, since it doesn't make sense to select another MDT which may occasionally be *more* full. This also reduces random "MDT jumping" and needless remote directories. 
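The "stay on the parent MDT if it is less full than average" rule above can be sketched as follows. This is a minimal userspace illustration, not the Lustre code: struct mdt_sketch and stay_on_current_mdt() are hypothetical names, a larger weight is assumed to mean more free space, and the zero-usable-targets guard is an addition of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a target descriptor; not the Lustre struct. */
struct mdt_sketch {
	bool usable;      /* connected and active */
	uint64_t weight;  /* QoS weight; assumed larger = more free space */
};

/* Return true when the current MDT's weight is at least the average
 * weight of all usable MDTs, i.e. the new directory should stay put.
 */
static bool stay_on_current_mdt(const struct mdt_sketch *mdts, size_t n,
				size_t cur)
{
	uint64_t total = 0;
	size_t usable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!mdts[i].usable)
			continue;
		total += mdts[i].weight;
		usable++;
	}

	/* Guard added in this sketch: nothing usable, nothing to stay on. */
	if (usable == 0 || cur >= n || !mdts[cur].usable)
		return false;

	return mdts[cur].weight >= total / usable;
}
```

Comparing against the average rather than the maximum is what avoids needless "MDT jumping": a parent MDT that is merely not the emptiest still keeps its subdirectories.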
Reduce the QOS threshold for space balanced LMV layouts, so that the MDTs don't become too imbalanced before trying to fix the problem. Change the LUSTRE_OPC_MKDIR opcode to be 1 instead of 0, so it can be seen that a valid opcode has been stored into the structure. WC-bug-id: https://jira.whamcloud.com/browse/LU-13439 Lustre-commit: 94da640afc0f ("LU-13439 lmv: qos stay on current MDT if less full") Signed-off-by: Lai Siyao Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/43445 Reviewed-by: Mike Pershin Reviewed-by: Hongchao Zhang Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lu_object.h | 6 ++++++ fs/lustre/include/obd.h | 10 +++++----- fs/lustre/lmv/lmv_obd.c | 22 +++++++++++++++++++--- fs/lustre/obdclass/lu_tgt_descs.c | 18 +++++++++++++----- 4 files changed, 43 insertions(+), 13 deletions(-) diff --git a/fs/lustre/include/lu_object.h b/fs/lustre/include/lu_object.h index 3a71d6b..b1d7577 100644 --- a/fs/lustre/include/lu_object.h +++ b/fs/lustre/include/lu_object.h @@ -1457,6 +1457,12 @@ struct lu_tgt_qos { }; /* target descriptor */ +#define LOV_QOS_DEF_THRESHOLD_RR_PCT 17 +#define LMV_QOS_DEF_THRESHOLD_RR_PCT 5 + +#define LOV_QOS_DEF_PRIO_FREE 90 +#define LMV_QOS_DEF_PRIO_FREE 90 + struct lu_tgt_desc { union { struct dt_device *ltd_tgt; diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h index efd4538..678953a 100644 --- a/fs/lustre/include/obd.h +++ b/fs/lustre/include/obd.h @@ -718,11 +718,11 @@ enum md_cli_flags { }; enum md_op_code { - LUSTRE_OPC_MKDIR = 0, - LUSTRE_OPC_SYMLINK = 1, - LUSTRE_OPC_MKNOD = 2, - LUSTRE_OPC_CREATE = 3, - LUSTRE_OPC_ANY = 5, + LUSTRE_OPC_MKDIR = 1, + LUSTRE_OPC_SYMLINK, + LUSTRE_OPC_MKNOD, + LUSTRE_OPC_CREATE, + LUSTRE_OPC_ANY, }; /** diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c index 552ef07..fb89047 100644 --- a/fs/lustre/lmv/lmv_obd.c +++ b/fs/lustre/lmv/lmv_obd.c @@ -1429,9 +1429,10 @@ static int lmv_close(struct obd_export *exp, struct
md_op_data *op_data, static struct lu_tgt_desc *lmv_locate_tgt_qos(struct lmv_obd *lmv, u32 *mdt) { - struct lu_tgt_desc *tgt; + struct lu_tgt_desc *tgt, *cur = NULL; u64 total_weight = 0; u64 cur_weight = 0; + int total_usable = 0; u64 rand; int rc; @@ -1452,15 +1453,30 @@ static struct lu_tgt_desc *lmv_locate_tgt_qos(struct lmv_obd *lmv, u32 *mdt) } lmv_foreach_tgt(lmv, tgt) { - tgt->ltd_qos.ltq_usable = 0; - if (!tgt->ltd_exp || !tgt->ltd_active) + if (!tgt->ltd_exp || !tgt->ltd_active) { + tgt->ltd_qos.ltq_usable = 0; continue; + } tgt->ltd_qos.ltq_usable = 1; lu_tgt_qos_weight_calc(tgt); + if (tgt->ltd_index == *mdt) { + cur = tgt; + cur_weight = tgt->ltd_qos.ltq_weight; + } total_weight += tgt->ltd_qos.ltq_weight; + total_usable++; + } + + /* if current MDT has higher-than-average space, stay on same MDT */ + rand = total_weight / total_usable; + if (cur_weight >= rand) { + tgt = cur; + rc = 0; + goto unlock; } + cur_weight = 0; rand = lu_prandom_u64_max(total_weight); lmv_foreach_connected_tgt(lmv, tgt) { diff --git a/fs/lustre/obdclass/lu_tgt_descs.c b/fs/lustre/obdclass/lu_tgt_descs.c index 83f4675..2a2b30a 100644 --- a/fs/lustre/obdclass/lu_tgt_descs.c +++ b/fs/lustre/obdclass/lu_tgt_descs.c @@ -265,13 +265,21 @@ int lu_tgt_descs_init(struct lu_tgt_descs *ltd, bool is_mdt) init_rwsem(&ltd->ltd_qos.lq_rw_sem); set_bit(LQ_DIRTY, &ltd->ltd_qos.lq_flags); set_bit(LQ_RESET, &ltd->ltd_qos.lq_flags); - /* Default priority is toward free space balance */ - ltd->ltd_qos.lq_prio_free = 232; - /* Default threshold for rr (roughly 17%) */ - ltd->ltd_qos.lq_threshold_rr = 43; ltd->ltd_is_mdt = is_mdt; - if (is_mdt) + /* MDT imbalance threshold is low to balance across MDTs + * relatively quickly, because each directory may result + * in a large number of files/subdirs created therein.
+ */ + if (is_mdt) { ltd->ltd_lmv_desc.ld_pattern = LMV_HASH_TYPE_DEFAULT; + ltd->ltd_qos.lq_prio_free = LMV_QOS_DEF_PRIO_FREE * 256 / 100; + ltd->ltd_qos.lq_threshold_rr = + LMV_QOS_DEF_THRESHOLD_RR_PCT * 256 / 100; + } else { + ltd->ltd_qos.lq_prio_free = LOV_QOS_DEF_PRIO_FREE * 256 / 100; + ltd->ltd_qos.lq_threshold_rr = + LOV_QOS_DEF_THRESHOLD_RR_PCT * 256 / 100; + } return 0; } From patchwork Sat May 15 13:06:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259783 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 861D5C433ED for ; Sat, 15 May 2021 13:06:37 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3A6B461355 for ; Sat, 15 May 2021 13:06:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3A6B461355 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C86D421F245; Sat, 15 May 2021 06:06:25 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 67E6E21EBDE for ; 
Sat, 15 May 2021 06:06:17 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 973EF1006EA3; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 912129814B; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:09 -0400 Message-Id: <1621083970-32463-13-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 12/13] lnet: Correct the router ping interval calculation X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn The router ping interval is being divided by the number of local nets which results in sending pings more frequently than defined by the alive_router_check_interval. In addition, the current code is structured such that we may not find a peer net in need of a ping until after inspecting the router list multiple times. Re-work the code so that the loop that inspects a router's peer nets will look at all of them until it either loops back around the list or it finds one that actually needs to be pinged. We also move the check of LNET_PEER_RTR_DISCOVERY so that we avoid the work of inspecting the router's peer nets if the router is already being discovered. 
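The difference between the old and new ping checks described above can be sketched in userspace C. The function names are hypothetical and times are plain seconds; only the two comparisons mirror what the patch removes and adds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old check, as removed by the patch: a ping is considered due once
 * interval / nets seconds have passed since the last check, so more
 * local nets means more frequent pings.
 */
static bool old_ping_due(int64_t now, int64_t last_check,
			 int64_t interval, int nets)
{
	return now - last_check >= interval / nets;
}

/* New check: each peer net stores the absolute time of its next ping. */
static bool new_ping_due(int64_t now, int64_t next_ping)
{
	return now >= next_ping;
}

/* After a successful discovery, schedule the next ping a full interval
 * ahead, independent of the number of local nets.
 */
static int64_t schedule_next_ping(int64_t now, int64_t interval)
{
	return now + interval;
}
```

With a 60-second interval and 4 local nets, the old check fires after 15 seconds, while the new one waits the full 60 seconds.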
HPE-bug-id: LUS-9237 WC-bug-id: https://jira.whamcloud.com/browse/LU-13912 Lustre-commit: 0131d39a622f1efc ("LU-13912 lnet: Correct the router ping interval calculation") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/39694 Reviewed-by: Serguei Smirnov Reviewed-by: Neil Brown Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-types.h | 4 +-- net/lnet/lnet/router.c | 57 ++++++++++++++++++++++++++---------------- 2 files changed, 37 insertions(+), 24 deletions(-) diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h index f199b15..d898066 100644 --- a/include/linux/lnet/lib-types.h +++ b/include/linux/lnet/lib-types.h @@ -798,8 +798,8 @@ struct lnet_peer_net { /* peer net health */ int lpn_healthv; - /* time of last router net check attempt */ - time64_t lpn_rtrcheck_timestamp; + /* time of next router ping on this net */ + time64_t lpn_next_ping; /* selection sequence number */ u32 lpn_seq; diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c index e179997..9003d47 100644 --- a/net/lnet/lnet/router.c +++ b/net/lnet/lnet/router.c @@ -603,6 +603,7 @@ static void lnet_shuffle_seed(void) unsigned int offset = 0; unsigned int len = 0; struct list_head *e; + time64_t now; lnet_shuffle_seed(); @@ -623,9 +624,10 @@ static void lnet_shuffle_seed(void) /* force a router check on the gateway to make sure the route is * alive */ + now = ktime_get_real_seconds(); list_for_each_entry(lpn, &route->lr_gateway->lp_peer_nets, lpn_peer_nets) { - lpn->lpn_rtrcheck_timestamp = 0; + lpn->lpn_next_ping = now; } the_lnet.ln_remote_nets_version++; @@ -1105,11 +1107,12 @@ bool lnet_router_checker_active(void) void lnet_check_routers(void) { - struct lnet_peer_net *first_lpn = NULL; + struct lnet_peer_net *first_lpn; struct lnet_peer_net *lpn; struct lnet_peer_ni *lpni; struct lnet_peer *rtr; bool push = false; + bool needs_ping; bool found_lpn; u64 version; u32 net_id; @@ -1122,14 +1125,18 @@ bool 
lnet_router_checker_active(void) version = the_lnet.ln_routers_version; list_for_each_entry(rtr, &the_lnet.ln_routers, lp_rtr_list) { + /* If we're currently discovering the peer then don't + * issue another discovery + */ + if (rtr->lp_state & LNET_PEER_RTR_DISCOVERY) + continue; + now = ktime_get_real_seconds(); - /* only discover the router if we've passed - * alive_router_check_interval seconds. Some of the router - * interfaces could be down and in that case they would be - * undergoing recovery separately from this discovery. - */ - /* find next peer net which is also local */ + /* find the next local peer net which needs to be ping'd */ + needs_ping = false; + first_lpn = NULL; + found_lpn = false; net_id = rtr->lp_disc_net_id; do { lpn = lnet_get_next_peer_net_locked(rtr, net_id); @@ -1138,13 +1145,27 @@ bool lnet_router_checker_active(void) libcfs_nid2str(rtr->lp_primary_nid)); break; } + + /* We looped back to the first peer net */ if (first_lpn == lpn) break; if (!first_lpn) first_lpn = lpn; - found_lpn = lnet_islocalnet_locked(lpn->lpn_net_id); + net_id = lpn->lpn_net_id; - } while (!found_lpn); + if (!lnet_islocalnet_locked(net_id)) + continue; + + found_lpn = true; + + CDEBUG(D_NET, "rtr %s(%p) %s(%p) next ping %lld\n", + libcfs_nid2str(rtr->lp_primary_nid), rtr, + libcfs_net2str(net_id), lpn, + lpn->lpn_next_ping); + + needs_ping = now >= lpn->lpn_next_ping; + + } while (!needs_ping); if (!found_lpn || !lpn) { CERROR("no local network found for gateway %s\n", @@ -1152,18 +1173,10 @@ bool lnet_router_checker_active(void) continue; } - if (now - lpn->lpn_rtrcheck_timestamp < - alive_router_check_interval / lnet_current_net_count) + if (!needs_ping) continue; - /* If we're currently discovering the peer then don't - * issue another discovery - */ spin_lock(&rtr->lp_lock); - if (rtr->lp_state & LNET_PEER_RTR_DISCOVERY) { - spin_unlock(&rtr->lp_lock); - continue; - } /* make sure we fully discover the router */ rtr->lp_state &= ~LNET_PEER_NIDS_UPTODATE; 
rtr->lp_state |= LNET_PEER_FORCE_PING | LNET_PEER_FORCE_PUSH | @@ -1188,16 +1201,16 @@ bool lnet_router_checker_active(void) libcfs_nid2str(lpni->lpni_nid), cpt); rc = lnet_discover_peer_locked(lpni, cpt, false); - /* decrement ref count acquired by find_peer_ni_locked() */ + /* drop ref taken above */ lnet_peer_ni_decref_locked(lpni); if (!rc) - lpn->lpn_rtrcheck_timestamp = now; + lpn->lpn_next_ping = now + alive_router_check_interval; else CERROR("Failed to discover router %s\n", libcfs_nid2str(rtr->lp_primary_nid)); - /* NB dropped lock */ + /* NB cpt lock was dropped in lnet_discover_peer_locked() */ if (version != the_lnet.ln_routers_version) { /* the routers list has changed */ goto rescan; From patchwork Sat May 15 13:06:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12259779 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50D5EC433ED for ; Sat, 15 May 2021 13:06:31 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1295861355 for ; Sat, 15 May 2021 13:06:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1295861355 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id CAC5121FA9D; Sat, 15 May 2021 06:06:22 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 1A8D621E068 for ; Sat, 15 May 2021 06:06:17 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 9694E1006EA1; Sat, 15 May 2021 09:06:12 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 948BA98158; Sat, 15 May 2021 09:06:12 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Sat, 15 May 2021 09:06:10 -0400 Message-Id: <1621083970-32463-14-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> References: <1621083970-32463-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 13/13] lustre: llite: Introduce inode open heat counter X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Oleg Drokin Initial framework to support detection of naive apps that assume open-closes are "free" and proceed to open/close same files between minute operations. We will track number of file opens per inode and last time inode was closed. 
Initially we'll expose these controls: llite/opencache_threshold_count - enables functionality and controls after how many opens open lock is requested llite/opencache_threshold_ms - if any reopen happens within this time (in ms), the open would trigger open lock request llite/opencache_max_ms - If last close was longer than this many ms ago - start counting opens from zero again Once enough useful data is collected we can look into adding a heatmap or another similar mechanism to better manage it and enable it by default with sensible settings. WC-bug-id: https://jira.whamcloud.com/browse/LU-10948 Lustre-commit: 41d99c4902836b726 ("LU-10948 llite: Introduce inode open heat counter") Signed-off-by: Oleg Drokin Reviewed-on: https://review.whamcloud.com/32158 Reviewed-by: Andreas Dilger Reviewed-by: Yingjin Qian Signed-off-by: James Simmons --- fs/lustre/llite/file.c | 70 ++++++++++++++++++++---- fs/lustre/llite/llite_internal.h | 32 +++++++++-- fs/lustre/llite/llite_lib.c | 5 ++ fs/lustre/llite/lproc_llite.c | 112 +++++++++++++++++++++++++++++++++++++-- fs/lustre/llite/namei.c | 7 +++ 5 files changed, 210 insertions(+), 16 deletions(-) diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index ffddec6..26aa7be 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -43,6 +43,7 @@ #include #include #include +#include #include #include @@ -414,6 +415,8 @@ int ll_file_release(struct inode *inode, struct file *file) lli->lli_async_rc = 0; } + lli->lli_close_fd_time = ktime_get(); + rc = ll_md_close(inode, file); if (CFS_FAIL_TIMEOUT_MS(OBD_FAIL_PTLRPC_DUMP_LOG, cfs_fail_val)) @@ -745,6 +748,29 @@ static int ll_local_open(struct file *file, struct lookup_intent *it, return 0; } +void ll_track_file_opens(struct inode *inode) +{ + struct ll_inode_info *lli = ll_i2info(inode); + struct ll_sb_info *sbi = ll_i2sbi(inode); + + /* do not skew results with delays from never-opened inodes */ + if (ktime_to_ns(lli->lli_close_fd_time)) + ll_stats_ops_tally(sbi, 
LPROC_LL_INODE_OPCLTM, + ktime_us_delta(ktime_get(), lli->lli_close_fd_time)); + + if (ktime_after(ktime_get(), + ktime_add_ms(lli->lli_close_fd_time, + sbi->ll_oc_max_ms))) { + lli->lli_open_fd_count = 1; + lli->lli_close_fd_time = ns_to_ktime(0); + } else { + lli->lli_open_fd_count++; + } + + ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_INODE_OCOUNT, + lli->lli_open_fd_count); +} + /* Open a file, and (for the very first open) create objects on the OSTs at * this time. If opened with O_LOV_DELAY_CREATE, then we don't do the object * creation or open until ll_lov_setstripe() ioctl is called. @@ -791,6 +817,7 @@ int ll_file_open(struct inode *inode, struct file *file) if (S_ISDIR(inode->i_mode)) ll_authorize_statahead(inode, fd); + ll_track_file_opens(inode); if (is_root_inode(inode)) { file->private_data = fd; return 0; @@ -868,6 +895,7 @@ int ll_file_open(struct inode *inode, struct file *file) LASSERT(*och_usecount == 0); if (!it->it_disposition) { struct dentry *dentry = file_dentry(file); + struct ll_sb_info *sbi = ll_i2sbi(inode); struct ll_dentry_data *ldd; /* We cannot just request lock handle now, new ELC code @@ -884,20 +912,42 @@ int ll_file_open(struct inode *inode, struct file *file) * handle to be returned from LOOKUP|OPEN request, * for example if the target entry was a symlink. * - * Only fetch MDS_OPEN_LOCK if this is in NFS path, - * marked by a bit set in ll_iget_for_nfs. Clear the - * bit so that it's not confusing later callers. + * In NFS path we know there's pathologic behavior + * so we always enable open lock caching when coming + * from there. It's detected by setting a flag in + * ll_iget_for_nfs. * - * NB; when ldd is NULL, it must have come via normal - * lookup path only, since ll_iget_for_nfs always calls - * ll_d_init(). 
+ * After reaching number of opens of this inode + * we always ask for an open lock on it to handle + * bad userspace actors that open and close files + * in a loop for absolutely no good reason */ ldd = ll_d2d(dentry); - if (ldd && ldd->lld_nfs_dentry) { + if (filename_is_volatile(dentry->d_name.name, + dentry->d_name.len, + NULL)) { + /* There really is nothing here, but this + * make this more readable I think. + * We do not want openlock for volatile + * files under any circumstances + */ + } else if (ldd && ldd->lld_nfs_dentry) { + /* NFS path. This also happens to catch + * open by fh files I guess + */ + it->it_flags |= MDS_OPEN_LOCK; + /* clear the flag for future lookups */ ldd->lld_nfs_dentry = 0; - if (!filename_is_volatile(dentry->d_name.name, - dentry->d_name.len, - NULL)) + } else if (sbi->ll_oc_thrsh_count > 0) { + /* Take MDS_OPEN_LOCK with many opens */ + if (lli->lli_open_fd_count >= + sbi->ll_oc_thrsh_count) + it->it_flags |= MDS_OPEN_LOCK; + + /* If this is open after we just closed */ + else if (ktime_before(ktime_get(), + ktime_add_ms(lli->lli_close_fd_time, + sbi->ll_oc_thrsh_ms))) it->it_flags |= MDS_OPEN_LOCK; } diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 03d2796..72aa564 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -137,9 +137,15 @@ struct ll_inode_info { struct obd_client_handle *lli_mds_read_och; struct obd_client_handle *lli_mds_write_och; struct obd_client_handle *lli_mds_exec_och; - u64 lli_open_fd_read_count; - u64 lli_open_fd_write_count; - u64 lli_open_fd_exec_count; + u64 lli_open_fd_read_count; + u64 lli_open_fd_write_count; + u64 lli_open_fd_exec_count; + + /* Number of times this inode was opened */ + u64 lli_open_fd_count; + /* When last close was performed on this inode */ + ktime_t lli_close_fd_time; + /* Protects access to och pointers and their usage counters */ struct mutex lli_och_mutex; @@ -765,6 +771,19 @@ struct ll_sb_info { unsigned 
int ll_heat_decay_weight; unsigned int ll_heat_period_second; + /* Opens of the same inode before we start requesting open lock */ + u32 ll_oc_thrsh_count; + + /* Time in ms between last inode close and next open to be considered + * instant back to back and would trigger an open lock request + */ + u32 ll_oc_thrsh_ms; + + /* Time in ms after last file close that we no longer count prior + * opens + */ + u32 ll_oc_max_ms; + /* filesystem fsname */ char ll_fsname[LUSTRE_MAXFSNAME + 1]; @@ -788,6 +807,10 @@ struct ll_sb_info { #define SBI_DEFAULT_HEAT_DECAY_WEIGHT ((80 * 256 + 50) / 100) #define SBI_DEFAULT_HEAT_PERIOD_SECOND (60) +#define SBI_DEFAULT_OPENCACHE_THRESHOLD_COUNT (5) +#define SBI_DEFAULT_OPENCACHE_THRESHOLD_MS (100) /* 0.1 second */ +#define SBI_DEFAULT_OPENCACHE_THRESHOLD_MAX_MS (60000) /* 1 minute */ + /* * per file-descriptor read-ahead data. */ @@ -1029,6 +1052,8 @@ enum { LPROC_LL_REMOVEXATTR, LPROC_LL_INODE_PERM, LPROC_LL_FALLOCATE, + LPROC_LL_INODE_OCOUNT, + LPROC_LL_INODE_OPCLTM, LPROC_LL_FILE_OPCODES }; @@ -1088,6 +1113,7 @@ enum ldlm_mode ll_take_md_lock(struct inode *inode, u64 bits, int ll_file_release(struct inode *inode, struct file *file); int ll_release_openhandle(struct inode *inode, struct lookup_intent *it); int ll_md_real_close(struct inode *inode, fmode_t fmode); +void ll_track_file_opens(struct inode *inode); int ll_getattr(const struct path *path, struct kstat *stat, u32 request_mask, unsigned int flags); int ll_getattr_dentry(struct dentry *de, struct kstat *stat, u32 request_mask, diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index ada2b625c..0c914c9 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -190,6 +190,11 @@ static struct ll_sb_info *ll_init_sbi(void) /* Per-filesystem file heat */ sbi->ll_heat_decay_weight = SBI_DEFAULT_HEAT_DECAY_WEIGHT; sbi->ll_heat_period_second = SBI_DEFAULT_HEAT_PERIOD_SECOND; + + /* Per-fs open heat level before requesting open lock */ + 
sbi->ll_oc_thrsh_count = SBI_DEFAULT_OPENCACHE_THRESHOLD_COUNT; + sbi->ll_oc_max_ms = SBI_DEFAULT_OPENCACHE_THRESHOLD_MAX_MS; + sbi->ll_oc_thrsh_ms = SBI_DEFAULT_OPENCACHE_THRESHOLD_MS; return sbi; out_destroy_ra: kfree(sbi->ll_foreign_symlink_upcall); diff --git a/fs/lustre/llite/lproc_llite.c b/fs/lustre/llite/lproc_llite.c index 16d1497..cd8394c 100644 --- a/fs/lustre/llite/lproc_llite.c +++ b/fs/lustre/llite/lproc_llite.c @@ -1369,6 +1369,105 @@ static ssize_t heat_period_second_store(struct kobject *kobj, } LUSTRE_RW_ATTR(heat_period_second); +static ssize_t opencache_threshold_count_show(struct kobject *kobj, + struct attribute *attr, + char *buf) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + + if (sbi->ll_oc_thrsh_count) + return snprintf(buf, PAGE_SIZE, "%u\n", + sbi->ll_oc_thrsh_count); + else + return snprintf(buf, PAGE_SIZE, "off\n"); +} + +static ssize_t opencache_threshold_count_store(struct kobject *kobj, + struct attribute *attr, + const char *buffer, + size_t count) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + unsigned int val; + int rc; + + rc = kstrtouint(buffer, 10, &val); + if (rc) { + bool enable; + /* also accept "off" to disable and "on" to always cache */ + rc = kstrtobool(buffer, &enable); + if (rc) + return rc; + val = enable; + } + sbi->ll_oc_thrsh_count = val; + + return count; +} +LUSTRE_RW_ATTR(opencache_threshold_count); + +static ssize_t opencache_threshold_ms_show(struct kobject *kobj, + struct attribute *attr, + char *buf) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + + return snprintf(buf, PAGE_SIZE, "%u\n", sbi->ll_oc_thrsh_ms); +} + +static ssize_t opencache_threshold_ms_store(struct kobject *kobj, + struct attribute *attr, + const char *buffer, + size_t count) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + unsigned int val; + int rc; + + rc = kstrtouint(buffer, 10, 
&val); + if (rc) + return rc; + + sbi->ll_oc_thrsh_ms = val; + + return count; +} +LUSTRE_RW_ATTR(opencache_threshold_ms); + +static ssize_t opencache_max_ms_show(struct kobject *kobj, + struct attribute *attr, + char *buf) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + + return snprintf(buf, PAGE_SIZE, "%u\n", sbi->ll_oc_max_ms); +} + +static ssize_t opencache_max_ms_store(struct kobject *kobj, + struct attribute *attr, + const char *buffer, + size_t count) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + unsigned int val; + int rc; + + rc = kstrtouint(buffer, 10, &val); + if (rc) + return rc; + + sbi->ll_oc_max_ms = val; + + return count; +} +LUSTRE_RW_ATTR(opencache_max_ms); + static int ll_unstable_stats_seq_show(struct seq_file *m, void *v) { struct super_block *sb = m->private; @@ -1568,6 +1667,8 @@ struct ldebugfs_vars lprocfs_llite_obd_vars[] = { &lustre_attr_max_read_ahead_mb.attr, &lustre_attr_max_read_ahead_per_file_mb.attr, &lustre_attr_max_read_ahead_whole_mb.attr, + &lustre_attr_max_read_ahead_async_active.attr, + &lustre_attr_read_ahead_async_file_threshold_mb.attr, &lustre_attr_read_ahead_range_kb.attr, &lustre_attr_checksums.attr, &lustre_attr_checksum_pages.attr, @@ -1587,8 +1688,9 @@ struct ldebugfs_vars lprocfs_llite_obd_vars[] = { &lustre_attr_file_heat.attr, &lustre_attr_heat_decay_percentage.attr, &lustre_attr_heat_period_second.attr, - &lustre_attr_max_read_ahead_async_active.attr, - &lustre_attr_read_ahead_async_file_threshold_mb.attr, + &lustre_attr_opencache_threshold_count.attr, + &lustre_attr_opencache_threshold_ms.attr, + &lustre_attr_opencache_max_ms.attr, NULL, }; @@ -1624,12 +1726,16 @@ static void sbi_kobj_release(struct kobject *kobj) { LPROC_LL_LLSEEK, LPROCFS_TYPE_LATENCY, "seek" }, { LPROC_LL_FSYNC, LPROCFS_TYPE_LATENCY, "fsync" }, { LPROC_LL_READDIR, LPROCFS_TYPE_LATENCY, "readdir" }, + { LPROC_LL_INODE_OCOUNT, LPROCFS_TYPE_REQS | + 
LPROCFS_CNTR_AVGMINMAX | + LPROCFS_CNTR_STDDEV, "opencount" }, + { LPROC_LL_INODE_OPCLTM, LPROCFS_TYPE_LATENCY, "openclosetime" }, /* inode operation */ { LPROC_LL_SETATTR, LPROCFS_TYPE_LATENCY, "setattr" }, { LPROC_LL_TRUNC, LPROCFS_TYPE_LATENCY, "truncate" }, { LPROC_LL_FLOCK, LPROCFS_TYPE_LATENCY, "flock" }, { LPROC_LL_GETATTR, LPROCFS_TYPE_LATENCY, "getattr" }, - { LPROC_LL_FALLOCATE, LPROCFS_TYPE_LATENCY, "fallocate" }, + { LPROC_LL_FALLOCATE, LPROCFS_TYPE_LATENCY, "fallocate" }, /* dir inode operation */ { LPROC_LL_CREATE, LPROCFS_TYPE_LATENCY, "create" }, { LPROC_LL_LINK, LPROCFS_TYPE_LATENCY, "link" }, diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c index 6ed2943..f5f34b0 100644 --- a/fs/lustre/llite/namei.c +++ b/fs/lustre/llite/namei.c @@ -1148,6 +1148,13 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry, OBD_FAIL_TIMEOUT(OBD_FAIL_LLITE_CREATE_FILE_PAUSE2, cfs_fail_val); + /* We can only arrive at this path when we have no inode, so + * we only need to request open lock if it was requested + * for every open + */ + if (ll_i2sbi(dir)->ll_oc_thrsh_count == 1) + it->it_flags |= MDS_OPEN_LOCK; + /* Dentry added to dcache tree in ll_lookup_it */ de = ll_lookup_it(dir, dentry, it, &secctx, &secctxlen, &pca, encrypt, &encctx, &encctxlen);
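Note on the open heat logic in patch 13/13: the decision of when to set MDS_OPEN_LOCK is spread across ll_file_release(), ll_track_file_opens() and ll_file_open(), so a userspace model may make the interaction of the three tunables easier to follow. The sketch below is not the kernel code. The struct and function names (oc_tunables, oc_inode, oc_track_open, oc_want_open_lock, oc_close) are hypothetical, time is a plain millisecond counter instead of ktime_t, and the volatile-file and NFS dentry branches are left out. It only mirrors the count reset after opencache_max_ms, the count threshold, and the "reopened shortly after close" window as they read in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three per-superblock tunables. */
struct oc_tunables {
	uint32_t thrsh_count;	/* opencache_threshold_count, 0 means off */
	uint32_t thrsh_ms;	/* opencache_threshold_ms */
	uint32_t max_ms;	/* opencache_max_ms */
};

/* Hypothetical stand-in for the per-inode open heat state. */
struct oc_inode {
	uint64_t open_count;	/* models lli_open_fd_count */
	uint64_t last_close_ms;	/* models lli_close_fd_time, 0 = never closed */
};

/* Models ll_track_file_opens(): restart the count once the last close is
 * older than max_ms, otherwise count one more open.
 */
static void oc_track_open(struct oc_inode *ino, const struct oc_tunables *t,
			  uint64_t now_ms)
{
	if (now_ms > ino->last_close_ms + t->max_ms) {
		ino->open_count = 1;
		ino->last_close_ms = 0;
	} else {
		ino->open_count++;
	}
}

/* Models the last branch of the decision in ll_file_open(): ask for an open
 * lock after enough opens, or when this open closely follows a close.
 */
static bool oc_want_open_lock(const struct oc_inode *ino,
			      const struct oc_tunables *t, uint64_t now_ms)
{
	if (t->thrsh_count == 0)
		return false;
	if (ino->open_count >= t->thrsh_count)
		return true;
	return now_ms < ino->last_close_ms + t->thrsh_ms;
}

/* Models the timestamp taken in ll_file_release(). */
static void oc_close(struct oc_inode *ino, uint64_t now_ms)
{
	ino->last_close_ms = now_ms;
}
```

With the defaults from the patch (5 opens, 100 ms, 60000 ms), a caller would invoke oc_track_open() and then oc_want_open_lock() on each open, and oc_close() on each release. Timestamps in this model should start well above thrsh_ms, because a last_close_ms of 0 doubles as the "never closed" marker.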