From patchwork Thu Jan 21 17:16:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2D76C433E6 for ; Thu, 21 Jan 2021 17:17:45 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5ED4723A57 for ; Thu, 21 Jan 2021 17:17:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5ED4723A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9A00D21FD91; Thu, 21 Jan 2021 09:17:32 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9109621FA40 for ; Thu, 21 Jan 2021 09:17:06 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 241F21007887; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 1A58CF0A7; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown 
Date: Thu, 21 Jan 2021 12:16:24 -0500 Message-Id: <1611249422-556-2-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 01/39] lustre: ldlm: page discard speedup X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Alexander Zarochentsev , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Alexander Zarochentsev Improve check_and_discard_cb() by caching the negative result of a DLM lock lookup, which avoids excessive osc_dlmlock_at_pgoff() calls. HPE-bug-id: LUS-6432 WC-bug-id: https://jira.whamcloud.com/browse/LU-11290 Lustre-commit: 0f48cd0b9856fe ("LU-11290 ldlm: page discard speedup") Signed-off-by: Alexander Zarochentsev Reviewed-on: https://review.whamcloud.com/39327 Reviewed-by: Vitaly Fertman Reviewed-by: Andrew Perepechko Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_dlm.h | 1 + fs/lustre/include/lustre_osc.h | 5 +++++ fs/lustre/ldlm/ldlm_lock.c | 16 +++++++++----- fs/lustre/osc/osc_cache.c | 48 +++++++++++++++++++++++++++++++----------- fs/lustre/osc/osc_lock.c | 3 +++ 5 files changed, 56 insertions(+), 17 deletions(-) diff --git a/fs/lustre/include/lustre_dlm.h b/fs/lustre/include/lustre_dlm.h index f056c2d..e4c95a2 100644 --- a/fs/lustre/include/lustre_dlm.h +++ b/fs/lustre/include/lustre_dlm.h @@ -858,6 +858,7 @@ enum ldlm_match_flags { LDLM_MATCH_UNREF = BIT(0), LDLM_MATCH_AST = BIT(1), LDLM_MATCH_AST_ANY = BIT(2), + LDLM_MATCH_RIGHT = BIT(3), }; /** diff --git a/fs/lustre/include/lustre_osc.h b/fs/lustre/include/lustre_osc.h index ef5237b..e7bf392 100644 --- a/fs/lustre/include/lustre_osc.h +++
b/fs/lustre/include/lustre_osc.h @@ -186,6 +186,7 @@ struct osc_thread_info { */ pgoff_t oti_next_index; pgoff_t oti_fn_index; /* first non-overlapped index */ + pgoff_t oti_ng_index; /* negative lock caching */ struct cl_sync_io oti_anchor; struct cl_req_attr oti_req_attr; struct lu_buf oti_ladvise_buf; @@ -248,6 +249,10 @@ enum osc_dap_flags { * check ast data is present, requested to cancel cb */ OSC_DAP_FL_AST = BIT(2), + /** + * look at right region for the desired lock + */ + OSC_DAP_FL_RIGHT = BIT(3), }; /* diff --git a/fs/lustre/ldlm/ldlm_lock.c b/fs/lustre/ldlm/ldlm_lock.c index 56f1550..b7ce0bb 100644 --- a/fs/lustre/ldlm/ldlm_lock.c +++ b/fs/lustre/ldlm/ldlm_lock.c @@ -1093,8 +1093,9 @@ static bool lock_matches(struct ldlm_lock *lock, void *vdata) switch (lock->l_resource->lr_type) { case LDLM_EXTENT: - if (lpol->l_extent.start > data->lmd_policy->l_extent.start || - lpol->l_extent.end < data->lmd_policy->l_extent.end) + if (!(data->lmd_match & LDLM_MATCH_RIGHT) && + (lpol->l_extent.start > data->lmd_policy->l_extent.start || + lpol->l_extent.end < data->lmd_policy->l_extent.end)) return false; if (unlikely(match == LCK_GROUP) && @@ -1160,10 +1161,17 @@ static bool lock_matches(struct ldlm_lock *lock, void *vdata) struct ldlm_lock *search_itree(struct ldlm_resource *res, struct ldlm_match_data *data) { + struct ldlm_extent ext = { + .start = data->lmd_policy->l_extent.start, + .end = data->lmd_policy->l_extent.end + }; int idx; data->lmd_lock = NULL; + if (data->lmd_match & LDLM_MATCH_RIGHT) + ext.end = OBD_OBJECT_EOF; + for (idx = 0; idx < LCK_MODE_NUM; idx++) { struct ldlm_interval_tree *tree = &res->lr_itree[idx]; @@ -1173,9 +1181,7 @@ struct ldlm_lock *search_itree(struct ldlm_resource *res, if (!(tree->lit_mode & *data->lmd_mode)) continue; - ldlm_extent_search(&tree->lit_root, - data->lmd_policy->l_extent.start, - data->lmd_policy->l_extent.end, + ldlm_extent_search(&tree->lit_root, ext.start, ext.end, lock_matches, data); if (data->lmd_lock) 
return data->lmd_lock; diff --git a/fs/lustre/osc/osc_cache.c b/fs/lustre/osc/osc_cache.c index ddf6fb1..d511ece 100644 --- a/fs/lustre/osc/osc_cache.c +++ b/fs/lustre/osc/osc_cache.c @@ -3207,28 +3207,51 @@ static bool check_and_discard_cb(const struct lu_env *env, struct cl_io *io, { struct osc_thread_info *info = osc_env_info(env); struct osc_object *osc = cbdata; + struct cl_page *page = ops->ops_cl.cpl_page; pgoff_t index; + bool discard = false; index = osc_index(ops); - if (index >= info->oti_fn_index) { + /* negative lock caching */ + if (index < info->oti_ng_index) { + discard = true; + } else if (index >= info->oti_fn_index) { struct ldlm_lock *tmp; - struct cl_page *page = ops->ops_cl.cpl_page; /* refresh non-overlapped index */ tmp = osc_dlmlock_at_pgoff(env, osc, index, - OSC_DAP_FL_TEST_LOCK | OSC_DAP_FL_AST); + OSC_DAP_FL_TEST_LOCK | + OSC_DAP_FL_AST | OSC_DAP_FL_RIGHT); if (tmp) { u64 end = tmp->l_policy_data.l_extent.end; - /* Cache the first-non-overlapped index so as to skip - * all pages within [index, oti_fn_index). This is safe - * because if tmp lock is canceled, it will discard - * these pages. - */ - info->oti_fn_index = cl_index(osc2cl(osc), end + 1); - if (end == OBD_OBJECT_EOF) - info->oti_fn_index = CL_PAGE_EOF; + u64 start = tmp->l_policy_data.l_extent.start; + + /* no lock covering this page */ + if (index < cl_index(osc2cl(osc), start)) { + /* no lock at @index, first lock at @start */ + info->oti_ng_index = cl_index(osc2cl(osc), + start); + discard = true; + } else { + /* Cache the first-non-overlapped index so as to + * skip all pages within [index, oti_fn_index). + * This is safe because if tmp lock is canceled, + * it will discard these pages. 
+ */ + info->oti_fn_index = cl_index(osc2cl(osc), + end + 1); + if (end == OBD_OBJECT_EOF) + info->oti_fn_index = CL_PAGE_EOF; + } LDLM_LOCK_PUT(tmp); - } else if (cl_page_own(env, io, page) == 0) { + } else { + info->oti_ng_index = CL_PAGE_EOF; + discard = true; + } + } + + if (discard) { + if (cl_page_own(env, io, page) == 0) { /* discard the page */ cl_page_discard(env, io, page); cl_page_disown(env, io, page); @@ -3292,6 +3315,7 @@ int osc_lock_discard_pages(const struct lu_env *env, struct osc_object *osc, cb = discard ? osc_discard_cb : check_and_discard_cb; info->oti_fn_index = start; info->oti_next_index = start; + info->oti_ng_index = 0; osc_page_gang_lookup(env, io, osc, info->oti_next_index, end, cb, osc); diff --git a/fs/lustre/osc/osc_lock.c b/fs/lustre/osc/osc_lock.c index 7bfcbfb..536142f2 100644 --- a/fs/lustre/osc/osc_lock.c +++ b/fs/lustre/osc/osc_lock.c @@ -1282,6 +1282,9 @@ struct ldlm_lock *osc_obj_dlmlock_at_pgoff(const struct lu_env *env, if (dap_flags & OSC_DAP_FL_CANCELING) match_flags |= LDLM_MATCH_UNREF; + if (dap_flags & OSC_DAP_FL_RIGHT) + match_flags |= LDLM_MATCH_RIGHT; + /* * It is fine to match any group lock since there could be only one * with a uniq gid and it conflicts with all other lock modes too From patchwork Thu Jan 21 17:16:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037141 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E120C433E0 for ; Thu, 21 Jan 2021 17:17:13 +0000 (UTC) 
Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A283123A57 for ; Thu, 21 Jan 2021 17:17:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A283123A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2C92421FBD5; Thu, 21 Jan 2021 09:17:11 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E0E3A21FA40 for ; Thu, 21 Jan 2021 09:17:06 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 2785F1007AB9; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 1D7F11158C; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:25 -0500 Message-Id: <1611249422-556-3-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 02/39] lustre: ptlrpc: fixes for RCU-related stalls X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Andrew Perepechko , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andrew Perepechko ptlrpc_expired_set() may need to process a lot of requests, so the processing loop needs to schedule from time to time to avoid RCU-related stalls. HPE-bug-id: LUS-8939 WC-bug-id: https://jira.whamcloud.com/browse/LU-13822 Lustre-commit: 1bbd5b5f0ee042 ("LU-13822 ptlrpc: fixes for RCU-related stalls") Signed-off-by: Andrew Perepechko Reviewed-on: https://review.whamcloud.com/39514 Reviewed-by: Alexander Zarochentsev Reviewed-by: Neil Brown Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ptlrpc/client.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/lustre/ptlrpc/client.c b/fs/lustre/ptlrpc/client.c index 0e01ab33..2002c03 100644 --- a/fs/lustre/ptlrpc/client.c +++ b/fs/lustre/ptlrpc/client.c @@ -2249,6 +2249,11 @@ void ptlrpc_expired_set(struct ptlrpc_request_set *set) * ptlrpcd thread. */ ptlrpc_expire_one_request(req, 1); + /* + * Loops require that we resched once in a while to avoid + * RCU stalls and a few other problems. 
+ */ + cond_resched(); } } From patchwork Thu Jan 21 17:16:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037169 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BAC88C433DB for ; Thu, 21 Jan 2021 17:17:51 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5477F23A59 for ; Thu, 21 Jan 2021 17:17:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5477F23A59 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C031F21FC1B; Thu, 21 Jan 2021 09:17:35 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2C92021FA40 for ; Thu, 21 Jan 2021 09:17:07 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 29821100804B; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 21F1411596; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , 
Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:26 -0500 Message-Id: <1611249422-556-4-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 03/39] lustre: ldlm: Do not wait for lock replay sending if import dsconnected X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Oleg Drokin If the import disconnected while we were preparing to send some lock replays, the sending RPC would get stuck on the sending list and would keep the reconnected import from progressing from the REPLAY to the REPLAY_LOCKS state, waiting for the queued replay RPCs to finish. Set them as no_delay to ensure they don't wait. LU-13600 exacerbated this issue somewhat, but it certainly existed before that change as well. WC-bug-id: https://jira.whamcloud.com/browse/LU-14027 Lustre-commit: f06a4efe13faca ("LU-14027 ldlm: Do not wait for lock replay sending if import dsconnected") Signed-off-by: Oleg Drokin Reviewed-on: https://review.whamcloud.com/40272 Reviewed-by: Andreas Dilger Reviewed-by: Mike Pershin Signed-off-by: James Simmons --- fs/lustre/ldlm/ldlm_request.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/lustre/ldlm/ldlm_request.c b/fs/lustre/ldlm/ldlm_request.c index 74bcba2..a2e1969 100644 --- a/fs/lustre/ldlm/ldlm_request.c +++ b/fs/lustre/ldlm/ldlm_request.c @@ -2173,6 +2173,8 @@ static int replay_one_lock(struct obd_import *imp, struct ldlm_lock *lock) /* We're part of recovery, so don't wait for it.
*/ req->rq_send_state = LUSTRE_IMP_REPLAY_LOCKS; + /* If the state changed while we were prepared, don't wait */ + req->rq_no_delay = 1; body = req_capsule_client_get(&req->rq_pill, &RMF_DLM_REQ); ldlm_lock2desc(lock, &body->lock_desc); From patchwork Thu Jan 21 17:16:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037173 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 557EAC433E0 for ; Thu, 21 Jan 2021 17:17:57 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 074C123A59 for ; Thu, 21 Jan 2021 17:17:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 074C123A59 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E227921FE22; Thu, 21 Jan 2021 09:17:38 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 642AC21FA40 for ; Thu, 21 Jan 2021 09:17:07 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with 
ESMTP id 2C558100804C; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 25C50115A7; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:27 -0500 Message-Id: <1611249422-556-5-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 04/39] lustre: ldlm: Do not hang if recovery restarted during lock replay X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Oleg Drokin LU-13600 introduced lock rate-limiting logic, but it did not take into account that if there is a disconnection in the REPLAY_LOCKS phase, the not-yet-sent locks get stuck in the sending queue, so the replay locks thread hangs with imp_replay_inflight elevated above zero. The direct consequence is that the recovery state machine never advances from the REPLAY to the REPLAY_LOCKS state while imp_replay_inflight is nonzero. Adjust __ldlm_replay_locks() to check whether the import state changed before attempting to send any more requests. Add a test case.
Fixes: 8cc7f22847 ("lustre: ptlrpc: limit rate of lock replays") WC-bug-id: https://jira.whamcloud.com/browse/LU-14027 Lustre-commit: 7ca495ec67f474 ("LU-14027 ldlm: Do not hang if recovery restarted during lock replay") Signed-off-by: Oleg Drokin Reviewed-on: https://review.whamcloud.com/40238 Reviewed-by: Mike Pershin Reviewed-by: Andreas Dilger Signed-off-by: James Simmons --- fs/lustre/ldlm/ldlm_request.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/lustre/ldlm/ldlm_request.c b/fs/lustre/ldlm/ldlm_request.c index a2e1969..86b10a7 100644 --- a/fs/lustre/ldlm/ldlm_request.c +++ b/fs/lustre/ldlm/ldlm_request.c @@ -2271,9 +2271,12 @@ int __ldlm_replay_locks(struct obd_import *imp, bool rate_limit) lock = list_first_entry(&list, struct ldlm_lock, l_pending_chain); list_del_init(&lock->l_pending_chain); - if (rc) { + /* If we disconnected in the middle - cleanup and let + * reconnection to happen again. LU-14027 + */ + if (rc || (imp->imp_state != LUSTRE_IMP_REPLAY_LOCKS)) { LDLM_LOCK_RELEASE(lock); - continue; /* or try to do the rest? 
*/ + continue; } rc = replay_one_lock(imp, lock); LDLM_LOCK_RELEASE(lock); From patchwork Thu Jan 21 17:16:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037143 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CFD41C433DB for ; Thu, 21 Jan 2021 17:17:15 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5D9F123A57 for ; Thu, 21 Jan 2021 17:17:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5D9F123A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C367821FC4B; Thu, 21 Jan 2021 09:17:13 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9BBB021FA40 for ; Thu, 21 Jan 2021 09:17:07 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 2F9DF100804D; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 28D8B115B8; Thu, 21 Jan 2021 12:17:05 
-0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:28 -0500 Message-Id: <1611249422-556-6-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 05/39] lnet: Correct handling of NETWORK_TIMEOUT status X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn The original intent of the LNET_MSG_STATUS_NETWORK_TIMEOUT health status was to handle cases where the LND was unsure whether the failure was due to the local or remote NI. In this case, we'll want to decrement both the local and remote NI health and allow recovery to ascertain which interface is actually healthy. 
HPE-bug-id: LUS-9342 WC-bug-id: https://jira.whamcloud.com/browse/LU-13751 Lustre-commit: ffd4523f2d50ef ("LU-13571 lnet: Correct handling of NETWORK_TIMEOUT status") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/39898 Reviewed-by: Amir Shehata Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/lnet/lib-msg.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c index e84cf02..d888090 100644 --- a/net/lnet/lnet/lib-msg.c +++ b/net/lnet/lnet/lib-msg.c @@ -925,9 +925,14 @@ case LNET_MSG_STATUS_REMOTE_ERROR: case LNET_MSG_STATUS_REMOTE_TIMEOUT: + if (handle_remote_health) + lnet_handle_remote_failure(lpni); + return -1; case LNET_MSG_STATUS_NETWORK_TIMEOUT: if (handle_remote_health) lnet_handle_remote_failure(lpni); + if (handle_local_health) + lnet_handle_local_failure(ni); return -1; default: LBUG(); From patchwork Thu Jan 21 17:16:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.9 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B01F1C433DB for ; Thu, 21 Jan 2021 17:18:03 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 680F723A5A for ; Thu, 21 Jan 2021 17:18:03 +0000 (UTC) DMARC-Filter: 
OpenDMARC Filter v1.3.2 mail.kernel.org 680F723A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 07CE821FE3D; Thu, 21 Jan 2021 09:17:42 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id D580221FA40 for ; Thu, 21 Jan 2021 09:17:07 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 32F6E100804E; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 2C28DF0A5; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:29 -0500 Message-Id: <1611249422-556-7-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 06/39] lnet: Introduce constant for net ID of LNET_NID_ANY X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn This patch adds a new constant, LNET_NET_ANY, to represent the net ID of the LNET_NID_ANY wildcard NID. 
HPE-bug-id: LUS-9122 WC-bug-id: https://jira.whamcloud.com/browse/LU-13837 Lustre-commit: 1741e993c874ed ("LU-13837 lnet: Introduce constant for net ID of LNET_NID_ANY") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/39544 Reviewed-by: Andreas Dilger Reviewed-by: Neil Brown Signed-off-by: James Simmons --- fs/lustre/ldlm/ldlm_lib.c | 2 +- fs/lustre/ptlrpc/sec_config.c | 10 +++++----- include/uapi/linux/lnet/lnet-types.h | 2 ++ net/lnet/lnet/config.c | 8 ++++---- net/lnet/lnet/lib-move.c | 3 +-- net/lnet/lnet/nidstrings.c | 2 +- net/lnet/lnet/peer.c | 2 +- net/lnet/lnet/router.c | 6 +++--- 8 files changed, 18 insertions(+), 17 deletions(-) diff --git a/fs/lustre/ldlm/ldlm_lib.c b/fs/lustre/ldlm/ldlm_lib.c index 713ca1c..2965395 100644 --- a/fs/lustre/ldlm/ldlm_lib.c +++ b/fs/lustre/ldlm/ldlm_lib.c @@ -487,7 +487,7 @@ int client_obd_setup(struct obd_device *obd, struct lustre_cfg *lcfg) if (lustre_cfg_buf(lcfg, 4)) { u32 refnet = libcfs_str2net(lustre_cfg_string(lcfg, 4)); - if (refnet == LNET_NIDNET(LNET_NID_ANY)) { + if (refnet == LNET_NET_ANY) { rc = -EINVAL; CERROR("%s: bad mount option 'network=%s': rc = %d\n", obd->obd_name, lustre_cfg_string(lcfg, 4), diff --git a/fs/lustre/ptlrpc/sec_config.c b/fs/lustre/ptlrpc/sec_config.c index 9ced6c7..0891f2f 100644 --- a/fs/lustre/ptlrpc/sec_config.c +++ b/fs/lustre/ptlrpc/sec_config.c @@ -145,7 +145,7 @@ static void get_default_flavor(struct sptlrpc_flavor *sf) static void sptlrpc_rule_init(struct sptlrpc_rule *rule) { - rule->sr_netid = LNET_NIDNET(LNET_NID_ANY); + rule->sr_netid = LNET_NET_ANY; rule->sr_from = LUSTRE_SP_ANY; rule->sr_to = LUSTRE_SP_ANY; rule->sr_padding = 0; @@ -177,7 +177,7 @@ static int sptlrpc_parse_rule(char *param, struct sptlrpc_rule *rule) /* 1.1 network */ if (strcmp(param, "default")) { rule->sr_netid = libcfs_str2net(param); - if (rule->sr_netid == LNET_NIDNET(LNET_NID_ANY)) { + if (rule->sr_netid == LNET_NET_ANY) { CERROR("invalid network name: %s\n", param); return 
-EINVAL; } @@ -263,7 +263,7 @@ static inline int rule_spec_dir(struct sptlrpc_rule *rule) static inline int rule_spec_net(struct sptlrpc_rule *rule) { - return (rule->sr_netid != LNET_NIDNET(LNET_NID_ANY)); + return (rule->sr_netid != LNET_NET_ANY); } static inline int rule_match_dir(struct sptlrpc_rule *r1, @@ -384,8 +384,8 @@ static int sptlrpc_rule_set_choose(struct sptlrpc_rule_set *rset, for (n = 0; n < rset->srs_nrule; n++) { r = &rset->srs_rules[n]; - if (LNET_NIDNET(nid) != LNET_NIDNET(LNET_NID_ANY) && - r->sr_netid != LNET_NIDNET(LNET_NID_ANY) && + if (LNET_NIDNET(nid) != LNET_NET_ANY && + r->sr_netid != LNET_NET_ANY && LNET_NIDNET(nid) != r->sr_netid) continue; diff --git a/include/uapi/linux/lnet/lnet-types.h b/include/uapi/linux/lnet/lnet-types.h index 70fab42..3324792 100644 --- a/include/uapi/linux/lnet/lnet-types.h +++ b/include/uapi/linux/lnet/lnet-types.h @@ -112,6 +112,8 @@ static inline __u32 LNET_MKNET(__u32 type, __u32 num) /** The lolnd NID (i.e. myself) */ #define LNET_NID_LO_0 LNET_MKNID(LNET_MKNET(LOLND, 0), 0) +#define LNET_NET_ANY LNET_NIDNET(LNET_NID_ANY) + /* Packed version of lnet_process_id to transfer via network */ struct lnet_process_id_packed { /* node id / process id */ diff --git a/net/lnet/lnet/config.c b/net/lnet/lnet/config.c index 6ddd9d6..b078bc8 100644 --- a/net/lnet/lnet/config.c +++ b/net/lnet/lnet/config.c @@ -679,7 +679,7 @@ struct lnet_ni * * At this point the name is properly terminated. 
*/ net_id = libcfs_str2net(name); - if (net_id == LNET_NIDNET(LNET_NID_ANY)) { + if (net_id == LNET_NET_ANY) { LCONSOLE_ERROR_MSG(0x113, "Unrecognised network type\n"); str = name; @@ -1169,7 +1169,7 @@ struct lnet_ni * if (ntokens == 1) { net = libcfs_str2net(ltb->ltb_text); - if (net == LNET_NIDNET(LNET_NID_ANY) || + if (net == LNET_NET_ANY || LNET_NETTYP(net) == LOLND) goto token_error; } else { @@ -1197,7 +1197,7 @@ struct lnet_ni * list_for_each_entry(ltb1, &nets, ltb_list) { net = libcfs_str2net(ltb1->ltb_text); - LASSERT(net != LNET_NIDNET(LNET_NID_ANY)); + LASSERT(net != LNET_NET_ANY); list_for_each_entry(ltb2, &gateways, ltb_list) { nid = libcfs_str2nid(ltb2->ltb_text); @@ -1403,7 +1403,7 @@ struct lnet_ni * *sep++ = 0; net = lnet_netspec2net(tb->ltb_text); - if (net == LNET_NIDNET(LNET_NID_ANY)) { + if (net == LNET_NET_ANY) { lnet_syntax("ip2nets", source, offset, strlen(tb->ltb_text)); return -EINVAL; diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index 1c9fb41..4687acd 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -1222,10 +1222,9 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats, struct lnet_peer *peer, u32 net_id) { struct lnet_peer_net *peer_net; - u32 any_net = LNET_NIDNET(LNET_NID_ANY); /* find the best_lpni on any local network */ - if (net_id == any_net) { + if (net_id == LNET_NET_ANY) { struct lnet_peer_ni *best_lpni = NULL; struct lnet_peer_net *lpn; diff --git a/net/lnet/lnet/nidstrings.c b/net/lnet/lnet/nidstrings.c index fb8d3e2..f260092 100644 --- a/net/lnet/lnet/nidstrings.c +++ b/net/lnet/lnet/nidstrings.c @@ -884,7 +884,7 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist) if (libcfs_str2net_internal(str, &net)) return net; - return LNET_NIDNET(LNET_NID_ANY); + return LNET_NET_ANY; } EXPORT_SYMBOL(libcfs_str2net); diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c index 3889310..70df37a 100644 --- a/net/lnet/lnet/peer.c +++ 
b/net/lnet/lnet/peer.c @@ -596,7 +596,7 @@ void lnet_peer_uninit(void) gw_nid = lp->lpni_peer_net->lpn_peer->lp_primary_nid; lnet_net_unlock(LNET_LOCK_EX); - lnet_del_route(LNET_NIDNET(LNET_NID_ANY), gw_nid); + lnet_del_route(LNET_NET_ANY, gw_nid); lnet_net_lock(LNET_LOCK_EX); } } diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c index 1253e4c..e030b16 100644 --- a/net/lnet/lnet/router.c +++ b/net/lnet/lnet/router.c @@ -664,7 +664,7 @@ static void lnet_shuffle_seed(void) if (gateway == LNET_NID_ANY || gateway == LNET_NID_LO_0 || - net == LNET_NIDNET(LNET_NID_ANY) || + net == LNET_NET_ANY || LNET_NETTYP(net) == LOLND || LNET_NIDNET(gateway) == net || (hops != LNET_UNDEFINED_HOPS && (hops < 1 || hops > 255))) @@ -841,7 +841,7 @@ static void lnet_shuffle_seed(void) lnet_peer_ni_decref_locked(lpni); } - if (net != LNET_NIDNET(LNET_NID_ANY)) { + if (net != LNET_NET_ANY) { rnet = lnet_find_rnet_locked(net); if (!rnet) { lnet_net_unlock(LNET_LOCK_EX); @@ -898,7 +898,7 @@ static void lnet_shuffle_seed(void) void lnet_destroy_routes(void) { - lnet_del_route(LNET_NIDNET(LNET_NID_ANY), LNET_NID_ANY); + lnet_del_route(LNET_NET_ANY, LNET_NID_ANY); } int lnet_get_rtr_pool_cfg(int cpt, struct lnet_ioctl_pool_cfg *pool_cfg) From patchwork Thu Jan 21 17:16:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037171 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06EE3C433DB for ; Thu, 21 Jan 2021 17:17:55 +0000 (UTC) Received: from 
pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A2E3123A5A for ; Thu, 21 Jan 2021 17:17:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A2E3123A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B184D21FE0F; Thu, 21 Jan 2021 09:17:37 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2E56421FA40 for ; Thu, 21 Jan 2021 09:17:08 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 34766100804F; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 2F921F0A7; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:30 -0500 Message-Id: <1611249422-556-8-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 07/39] lustre: ldlm: Don't re-enqueue glimpse lock on read X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Andriy Skulysh , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andriy Skulysh cl_glimpse_lock() doesn't match a lock with LDLM_FL_BL_AST even if this lock was acquired earlier by the same thread. It needs only the size to check for a sparse file, so let's add LDLM_FL_CBPENDING to the match flags. #1 [ffff9ba7326036f0] schedule at ffffffff87b67c49 #2 [ffff9ba732603700] obd_get_request_slot at ffffffffc0dbe0a4 [obdclass] #3 [ffff9ba7326037b8] ldlm_cli_enqueue at ffffffffc0faedce [ptlrpc] #4 [ffff9ba732603878] mdc_enqueue_send at ffffffffc11b38a8 [mdc] #5 [ffff9ba732603938] mdc_lock_enqueue at ffffffffc11b3eb2 [mdc] #6 [ffff9ba7326039a8] cl_lock_enqueue at ffffffffc0dfee95 [obdclass] #7 [ffff9ba7326039e0] lov_lock_enqueue at ffffffffc10ef265 [lov] #8 [ffff9ba732603a20] cl_lock_enqueue at ffffffffc0dfee95 [obdclass] #9 [ffff9ba732603a58] cl_lock_request at ffffffffc0dff54b [obdclass] HPE-bug-id: LUS-8690 WC-bug-id: https://jira.whamcloud.com/browse/LU-13987 Lustre-commit: 829a3a93d43e4d ("LU-13987 ldlm: Don't re-enqueue glimpse lock on read") Reviewed-on: https://review.whamcloud.com/40044 Signed-off-by: Andriy Skulysh Reviewed-by: Vitaly Fertman Reviewed-by: Alexander Boyko Reviewed-by: Andrew Perepechko Tested-by: Elena Gryaznova Reviewed-by: Alexander Boyko Reviewed-by: Vitaly Fertman Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_osc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/lustre/include/lustre_osc.h b/fs/lustre/include/lustre_osc.h index e7bf392..e32723c 100644 --- a/fs/lustre/include/lustre_osc.h +++ b/fs/lustre/include/lustre_osc.h @@ -203,7 +203,7 @@ static inline u64 osc_enq2ldlm_flags(u32 enqflags) if (enqflags & CEF_NONBLOCK) result |= LDLM_FL_BLOCK_NOWAIT; if (enqflags & CEF_GLIMPSE) - result |= LDLM_FL_HAS_INTENT; + result |= LDLM_FL_HAS_INTENT | 
LDLM_FL_CBPENDING; if (enqflags & CEF_DISCARD_DATA) result |= LDLM_FL_AST_DISCARD_DATA; if (enqflags & CEF_PEEK) From patchwork Thu Jan 21 17:16:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CC776C433DB for ; Thu, 21 Jan 2021 17:17:27 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 12CC323A57 for ; Thu, 21 Jan 2021 17:17:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 12CC323A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E713521FCCE; Thu, 21 Jan 2021 09:17:20 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 65FA821FA40 for ; Thu, 21 Jan 2021 09:17:08 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 35DD21008050; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 
320A71B49B; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:31 -0500 Message-Id: <1611249422-556-9-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 08/39] lustre: osc: prevent overflow of o_dropped X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Olaf Faaland In osc_announce_cached(), prevent o_dropped from overflowing. Necessary because o_dropped AKA o_misc is 32 bits, but cl_lost_grant is 64 bits. Add a CDEBUG call so we can tell whether this happened. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-14125 Lustre-commit: 82e9a11056a552 ("LU-14125 osc: prevent overflow of o_dropped") Signed-off-by: Olaf Faaland Reviewed-on: https://review.whamcloud.com/40659 Reviewed-by: Andreas Dilger Reviewed-by: James Simmons Signed-off-by: James Simmons --- fs/lustre/osc/osc_request.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c index f225ccd..4a4b5ef 100644 --- a/fs/lustre/osc/osc_request.c +++ b/fs/lustre/osc/osc_request.c @@ -754,11 +754,21 @@ static void osc_announce_cached(struct client_obd *cli, struct obdo *oa, ~(PTLRPC_MAX_BRW_SIZE * 4UL)); } oa->o_grant = cli->cl_avail_grant + cli->cl_reserved_grant; - oa->o_dropped = cli->cl_lost_grant; - cli->cl_lost_grant = 0; + /* o_dropped AKA o_misc is 32 bits, but cl_lost_grant is 64 bits */ + if (cli->cl_lost_grant > INT_MAX) { + CDEBUG(D_CACHE, + "%s: avoided o_dropped overflow: cl_lost_grant %lu\n", + cli_name(cli), cli->cl_lost_grant); + oa->o_dropped = INT_MAX; + } else { + oa->o_dropped = cli->cl_lost_grant; + } + cli->cl_lost_grant -= oa->o_dropped; spin_unlock(&cli->cl_loi_list_lock); - CDEBUG(D_CACHE, "dirty: %llu undirty: %u dropped %u grant: %llu\n", - oa->o_dirty, oa->o_undirty, oa->o_dropped, oa->o_grant); + CDEBUG(D_CACHE, + "%s: dirty: %llu undirty: %u dropped %u grant: %llu cl_lost_grant %lu\n", + cli_name(cli), oa->o_dirty, oa->o_undirty, oa->o_dropped, + oa->o_grant, cli->cl_lost_grant); } void osc_update_next_shrink(struct client_obd *cli) From patchwork Thu Jan 21 17:16:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037155 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, 
HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4B35C433DB for ; Thu, 21 Jan 2021 17:17:32 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7D5D723A57 for ; Thu, 21 Jan 2021 17:17:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7D5D723A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 743AC21FDAD; Thu, 21 Jan 2021 09:17:24 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9F39B21FA40 for ; Thu, 21 Jan 2021 09:17:08 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 37D301008051; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 357001158C; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:32 -0500 Message-Id: <1611249422-556-10-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 09/39] lustre: llite: fix client evicition with DIO X-BeenThere: 
lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong We set lockless in file open if the O_DIRECT flag is passed; however, the O_DIRECT flag can later be cleared by fcntl(..., F_SETFL, ...). We then end up doing buffered IO without the lock held properly, and hit a hang: [] osc_extent_wait+0x21d/0x7c0 [osc] [] osc_cache_wait_range+0x2e7/0x940 [osc] [] osc_cache_writeback_range+0x96e/0xff0 [osc] [] osc_lock_flush+0x195/0x290 [osc] [] osc_lock_lockless_cancel+0x3c/0xe0 [osc] [] cl_lock_cancel+0x78/0x160 [obdclass] [] lov_lock_cancel+0x99/0x190 [lov] [] cl_lock_cancel+0x78/0x160 [obdclass] [] cl_lock_release+0x52/0x140 [obdclass] [] cl_io_unlock+0x139/0x290 [obdclass] [] cl_io_loop+0xb8/0x200 [obdclass] [] ll_file_io_generic+0x91b/0xdf0 [lustre] [] ll_file_aio_write+0x29c/0x6e0 [lustre] [] ll_file_write+0x100/0x1c0 [lustre] [] vfs_write+0xc0/0x1f0 [] SyS_write+0x7f/0xf0 [] system_call_fastpath+0x25/0x2a [] 0xffffffffffffffff The lock cancel then times out on the server side and the client is evicted. Fix this problem by testing the O_DIRECT flag at IO time to decide whether we can issue lockless IO. 
Fixes: bf18998820 ("lustre: clio: turn on lockless for some kind of IO") WC-bug-id: https://jira.whamcloud.com/browse/LU-14072 Lustre-commit: f348437218d0b9 ("LU-14072 llite: fix client evicition with DIO") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/40389 Reviewed-by: Andreas Dilger Reviewed-by: Gu Zheng Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 2 +- fs/lustre/llite/file.c | 9 +++------ fs/lustre/llite/rw.c | 4 ++-- fs/lustre/llite/rw26.c | 6 +++--- fs/lustre/llite/vvp_io.c | 6 +++--- include/uapi/linux/lustre/lustre_user.h | 1 - 6 files changed, 12 insertions(+), 16 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index e17385c0..d2cee34 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -1962,7 +1962,7 @@ struct cl_io { /** * Ignore lockless and do normal locking for this io. */ - ci_ignore_lockless:1, + ci_dio_lock:1, /** * Set if we've tried all mirrors for this read IO, if it's not set, * the read IO will check to-be-read OSCs' status, and make fast-switch diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index f7f917b..2b0ffad 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -945,9 +945,6 @@ int ll_file_open(struct inode *inode, struct file *file) mutex_unlock(&lli->lli_och_mutex); - /* lockless for direct IO so that it can do IO in parallel */ - if (file->f_flags & O_DIRECT) - fd->fd_flags |= LL_FILE_LOCKLESS_IO; fd = NULL; /* Must do this outside lli_och_mutex lock to prevent deadlock where @@ -1573,7 +1570,7 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot, ssize_t result = 0; int rc = 0; unsigned int retried = 0; - unsigned int ignore_lockless = 0; + unsigned int dio_lock = 0; bool is_aio = false; struct cl_dio_aio *ci_aio = NULL; @@ -1595,7 +1592,7 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot, io = vvp_env_thread_io(env); ll_io_init(io, 
file, iot == CIT_WRITE, args); io->ci_aio = ci_aio; - io->ci_ignore_lockless = ignore_lockless; + io->ci_dio_lock = dio_lock; io->ci_ndelay_tried = retried; if (cl_io_rw_init(env, io, iot, *ppos, count) == 0) { @@ -1675,7 +1672,7 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot, *ppos, count, result); /* preserve the tried count for FLR */ retried = io->ci_ndelay_tried; - ignore_lockless = io->ci_ignore_lockless; + dio_lock = io->ci_dio_lock; goto restart; } diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c index 54f0b9a..da4a26d 100644 --- a/fs/lustre/llite/rw.c +++ b/fs/lustre/llite/rw.c @@ -1723,9 +1723,9 @@ int ll_readpage(struct file *file, struct page *vmpage) */ if (file->f_flags & O_DIRECT && lcc && lcc->lcc_type == LCC_RW && - !io->ci_ignore_lockless) { + !io->ci_dio_lock) { unlock_page(vmpage); - io->ci_ignore_lockless = 1; + io->ci_dio_lock = 1; io->ci_need_restart = 1; return -ENOLCK; } diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c index 1736e9a..605a326 100644 --- a/fs/lustre/llite/rw26.c +++ b/fs/lustre/llite/rw26.c @@ -538,12 +538,12 @@ static int ll_write_begin(struct file *file, struct address_space *mapping, } /* - * Direct read can fall back to buffered read, but DIO is done + * Direct write can fall back to buffered read, but DIO is done * with lockless i/o, and buffered requires LDLM locking, so * in this case we must restart without lockless. 
*/ - if (!io->ci_ignore_lockless) { - io->ci_ignore_lockless = 1; + if (!io->ci_dio_lock) { + io->ci_dio_lock = 1; io->ci_need_restart = 1; result = -ENOLCK; goto out; diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c index d6ca267..8dbe835 100644 --- a/fs/lustre/llite/vvp_io.c +++ b/fs/lustre/llite/vvp_io.c @@ -557,11 +557,11 @@ static int vvp_io_rw_lock(const struct lu_env *env, struct cl_io *io, if (vio->vui_fd) { /* Group lock held means no lockless any more */ if (vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED) - io->ci_ignore_lockless = 1; + io->ci_dio_lock = 1; if (ll_file_nolock(vio->vui_fd->fd_file) || - (vio->vui_fd->fd_flags & LL_FILE_LOCKLESS_IO && - !io->ci_ignore_lockless)) + (vio->vui_fd->fd_file->f_flags & O_DIRECT && + !io->ci_dio_lock)) ast_flags |= CEF_NEVER; } diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h index b0301e1..143b7d5 100644 --- a/include/uapi/linux/lustre/lustre_user.h +++ b/include/uapi/linux/lustre/lustre_user.h @@ -402,7 +402,6 @@ struct ll_ioc_lease_id { #define LL_FILE_GROUP_LOCKED 0x00000002 #define LL_FILE_READAHEA 0x00000004 #define LL_FILE_LOCKED_DIRECTIO 0x00000008 /* client-side locks with dio */ -#define LL_FILE_LOCKLESS_IO 0x00000010 /* server-side locks with cio */ #define LL_FILE_FLOCK_WARNING 0x00000020 /* warned about disabled flock */ #define LOV_USER_MAGIC_V1 0x0BD10BD0 From patchwork Thu Jan 21 17:16:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037197 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from 
mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DEC7BC433DB for ; Thu, 21 Jan 2021 17:18:40 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6F71723A5A for ; Thu, 21 Jan 2021 17:18:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6F71723A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 98BAA21FEF8; Thu, 21 Jan 2021 09:17:58 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id EB35221FA40 for ; Thu, 21 Jan 2021 09:17:08 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 3BCBA1008052; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 38BBF11596; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:33 -0500 Message-Id: <1611249422-556-11-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 10/39] lustre: Use vfree_atomic instead of vfree X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Oleg Drokin Since vfree is unsafe to use in atomic context, we can use vfree_atomic() introduced in linux 4.10 commit bf22e37a641327e34681b7b6959d9646e3886770. To use this we have to export vfree_atomic(). The biggest offender is in the ptlrpc code so replace the kvfree() with vfree_atomic(). WC-bug-id: https://jira.whamcloud.com/browse/LU-12564 Lustre-commit: 7a9c0ca690eb00 ("LU-12564 libcfs: Use vfree_atomic instead of vfree") Signed-off-by: Oleg Drokin Reviewed-on: https://review.whamcloud.com/40136 Reviewed-by: Andreas Dilger Reviewed-by: Aurelien Degremont Reviewed-by: Neil Brown Signed-off-by: James Simmons --- fs/lustre/ptlrpc/client.c | 6 +++++- fs/lustre/ptlrpc/sec.c | 21 +++++++++++++++++---- fs/lustre/ptlrpc/sec_bulk.c | 6 +++++- fs/lustre/ptlrpc/sec_null.c | 24 +++++++++++++++++++----- fs/lustre/ptlrpc/sec_plain.c | 24 +++++++++++++++++++----- fs/lustre/ptlrpc/service.c | 21 +++++++++++++++++---- mm/vmalloc.c | 1 + 7 files changed, 83 insertions(+), 20 deletions(-) diff --git a/fs/lustre/ptlrpc/client.c b/fs/lustre/ptlrpc/client.c index 2002c03..4b8aa25 100644 --- a/fs/lustre/ptlrpc/client.c +++ b/fs/lustre/ptlrpc/client.c @@ -38,6 +38,7 @@ #include #include #include +#include #include #include @@ -529,7 +530,10 @@ void ptlrpc_free_rq_pool(struct ptlrpc_request_pool *pool) list_del(&req->rq_list); LASSERT(req->rq_reqbuf); LASSERT(req->rq_reqbuf_len == pool->prp_rq_size); - kvfree(req->rq_reqbuf); + if (is_vmalloc_addr(req->rq_reqbuf)) + vfree_atomic(req->rq_reqbuf); + else + kfree(req->rq_reqbuf); ptlrpc_request_cache_free(req); } kfree(pool); diff --git a/fs/lustre/ptlrpc/sec.c b/fs/lustre/ptlrpc/sec.c index 44c15e6..43d4f76 100644 --- a/fs/lustre/ptlrpc/sec.c +++ b/fs/lustre/ptlrpc/sec.c @@ -42,6 +42,7 @@ #include #include #include +#include #include 
#include @@ -474,7 +475,10 @@ int sptlrpc_req_ctx_switch(struct ptlrpc_request *req, req->rq_flvr = old_flvr; } - kvfree(reqmsg); + if (is_vmalloc_addr(reqmsg)) + vfree_atomic(reqmsg); + else + kfree(reqmsg); } return rc; } @@ -836,7 +840,10 @@ void sptlrpc_request_out_callback(struct ptlrpc_request *req) if (req->rq_pool || !req->rq_reqbuf) return; - kvfree(req->rq_reqbuf); + if (is_vmalloc_addr(req->rq_reqbuf)) + vfree_atomic(req->rq_reqbuf); + else + kfree(req->rq_reqbuf); req->rq_reqbuf = NULL; req->rq_reqbuf_len = 0; } @@ -1133,7 +1140,10 @@ int sptlrpc_cli_unwrap_early_reply(struct ptlrpc_request *req, err_ctx: sptlrpc_cli_ctx_put(early_req->rq_cli_ctx, 1); err_buf: - kvfree(early_buf); + if (is_vmalloc_addr(early_buf)) + vfree_atomic(early_buf); + else + kfree(early_buf); err_req: ptlrpc_request_cache_free(early_req); return rc; @@ -1151,7 +1161,10 @@ void sptlrpc_cli_finish_early_reply(struct ptlrpc_request *early_req) LASSERT(early_req->rq_repmsg); sptlrpc_cli_ctx_put(early_req->rq_cli_ctx, 1); - kvfree(early_req->rq_repbuf); + if (is_vmalloc_addr(early_req->rq_repbuf)) + vfree_atomic(early_req->rq_repbuf); + else + kfree(early_req->rq_repbuf); ptlrpc_request_cache_free(early_req); } diff --git a/fs/lustre/ptlrpc/sec_bulk.c b/fs/lustre/ptlrpc/sec_bulk.c index 3c3ae8b..9548721 100644 --- a/fs/lustre/ptlrpc/sec_bulk.c +++ b/fs/lustre/ptlrpc/sec_bulk.c @@ -37,6 +37,7 @@ #define DEBUG_SUBSYSTEM S_SEC +#include #include #include @@ -380,7 +381,10 @@ static inline void enc_pools_free(void) LASSERT(page_pools.epp_max_pools); LASSERT(page_pools.epp_pools); - kvfree(page_pools.epp_pools); + if (is_vmalloc_addr(page_pools.epp_pools)) + vfree_atomic(page_pools.epp_pools); + else + kfree(page_pools.epp_pools); } static struct shrinker pools_shrinker = { diff --git a/fs/lustre/ptlrpc/sec_null.c b/fs/lustre/ptlrpc/sec_null.c index 97c4e19..3892d6e 100644 --- a/fs/lustre/ptlrpc/sec_null.c +++ b/fs/lustre/ptlrpc/sec_null.c @@ -37,6 +37,7 @@ #define DEBUG_SUBSYSTEM S_SEC 
+#include #include #include #include @@ -180,7 +181,10 @@ void null_free_reqbuf(struct ptlrpc_sec *sec, "req %p: reqlen %d should smaller than buflen %d\n", req, req->rq_reqlen, req->rq_reqbuf_len); - kvfree(req->rq_reqbuf); + if (is_vmalloc_addr(req->rq_reqbuf)) + vfree_atomic(req->rq_reqbuf); + else + kfree(req->rq_reqbuf); req->rq_reqbuf = NULL; req->rq_reqbuf_len = 0; } @@ -210,7 +214,10 @@ void null_free_repbuf(struct ptlrpc_sec *sec, { LASSERT(req->rq_repbuf); - kvfree(req->rq_repbuf); + if (is_vmalloc_addr(req->rq_repbuf)) + vfree_atomic(req->rq_repbuf); + else + kfree(req->rq_repbuf); req->rq_repbuf = NULL; req->rq_repbuf_len = 0; } @@ -257,7 +264,10 @@ int null_enlarge_reqbuf(struct ptlrpc_sec *sec, spin_lock(&req->rq_import->imp_lock); memcpy(newbuf, req->rq_reqbuf, req->rq_reqlen); - kvfree(req->rq_reqbuf); + if (is_vmalloc_addr(req->rq_reqbuf)) + vfree_atomic(req->rq_reqbuf); + else + kfree(req->rq_reqbuf); req->rq_reqbuf = newbuf; req->rq_reqmsg = newbuf; req->rq_reqbuf_len = alloc_size; @@ -337,8 +347,12 @@ void null_free_rs(struct ptlrpc_reply_state *rs) LASSERT_ATOMIC_GT(&rs->rs_svc_ctx->sc_refcount, 1); atomic_dec(&rs->rs_svc_ctx->sc_refcount); - if (!rs->rs_prealloc) - kvfree(rs); + if (!rs->rs_prealloc) { + if (is_vmalloc_addr(rs)) + vfree_atomic(rs); + else + kfree(rs); + } } static diff --git a/fs/lustre/ptlrpc/sec_plain.c b/fs/lustre/ptlrpc/sec_plain.c index b487968..80831af 100644 --- a/fs/lustre/ptlrpc/sec_plain.c +++ b/fs/lustre/ptlrpc/sec_plain.c @@ -38,6 +38,7 @@ #define DEBUG_SUBSYSTEM S_SEC #include +#include #include #include #include @@ -582,7 +583,10 @@ void plain_free_reqbuf(struct ptlrpc_sec *sec, struct ptlrpc_request *req) { if (!req->rq_pool) { - kvfree(req->rq_reqbuf); + if (is_vmalloc_addr(req->rq_reqbuf)) + vfree_atomic(req->rq_reqbuf); + else + kfree(req->rq_reqbuf); req->rq_reqbuf = NULL; req->rq_reqbuf_len = 0; } @@ -623,7 +627,10 @@ int plain_alloc_repbuf(struct ptlrpc_sec *sec, void plain_free_repbuf(struct ptlrpc_sec 
*sec, struct ptlrpc_request *req) { - kvfree(req->rq_repbuf); + if (is_vmalloc_addr(req->rq_repbuf)) + vfree_atomic(req->rq_repbuf); + else + kfree(req->rq_repbuf); req->rq_repbuf = NULL; req->rq_repbuf_len = 0; } @@ -678,7 +685,10 @@ int plain_enlarge_reqbuf(struct ptlrpc_sec *sec, memcpy(newbuf, req->rq_reqbuf, req->rq_reqbuf_len); - kvfree(req->rq_reqbuf); + if (is_vmalloc_addr(req->rq_reqbuf)) + vfree_atomic(req->rq_reqbuf); + else + kfree(req->rq_reqbuf); req->rq_reqbuf = newbuf; req->rq_reqbuf_len = newbuf_size; req->rq_reqmsg = lustre_msg_buf(req->rq_reqbuf, @@ -823,8 +833,12 @@ void plain_free_rs(struct ptlrpc_reply_state *rs) LASSERT(atomic_read(&rs->rs_svc_ctx->sc_refcount) > 1); atomic_dec(&rs->rs_svc_ctx->sc_refcount); - if (!rs->rs_prealloc) - kvfree(rs); + if (!rs->rs_prealloc) { + if (is_vmalloc_addr(rs)) + vfree_atomic(rs); + else + kfree(rs); + } } static diff --git a/fs/lustre/ptlrpc/service.c b/fs/lustre/ptlrpc/service.c index 5881e0a..b341877 100644 --- a/fs/lustre/ptlrpc/service.c +++ b/fs/lustre/ptlrpc/service.c @@ -36,6 +36,7 @@ #include #include #include +#include #include #include @@ -118,7 +119,10 @@ static void ptlrpc_free_rqbd(struct ptlrpc_request_buffer_desc *rqbd) svcpt->scp_nrqbds_total--; spin_unlock(&svcpt->scp_lock); - kvfree(rqbd->rqbd_buffer); + if (is_vmalloc_addr(rqbd->rqbd_buffer)) + vfree_atomic(rqbd->rqbd_buffer); + else + kfree(rqbd->rqbd_buffer); kfree(rqbd); } @@ -838,7 +842,10 @@ static void ptlrpc_server_drop_request(struct ptlrpc_request *req) test_req_buffer_pressure) { /* like in ptlrpc_free_rqbd() */ svcpt->scp_nrqbds_total--; - kvfree(rqbd->rqbd_buffer); + if (is_vmalloc_addr(rqbd->rqbd_buffer)) + vfree_atomic(rqbd->rqbd_buffer); + else + kfree(rqbd->rqbd_buffer); kfree(rqbd); } else { list_add_tail(&rqbd->rqbd_list, @@ -1197,7 +1204,10 @@ static int ptlrpc_at_send_early_reply(struct ptlrpc_request *req) class_export_put(reqcopy->rq_export); out: sptlrpc_svc_ctx_decref(reqcopy); - kvfree(reqmsg); + if 
(is_vmalloc_addr(reqmsg)) + vfree_atomic(reqmsg); + else + kfree(reqmsg); out_free: ptlrpc_request_cache_free(reqcopy); return rc; @@ -2938,7 +2948,10 @@ static void ptlrpc_wait_replies(struct ptlrpc_service_part *svcpt) struct ptlrpc_reply_state, rs_list)) != NULL) { list_del(&rs->rs_list); - kvfree(rs); + if (is_vmalloc_addr(rs)) + vfree_atomic(rs); + else + kfree(rs); } } } diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 9a8227a..9a27fbd 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2355,6 +2355,7 @@ void vfree_atomic(const void *addr) return; __vfree_deferred(addr); } +EXPORT_SYMBOL(vfree_atomic); static void __vfree(const void *addr) { From patchwork Thu Jan 21 17:16:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037145 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AB02C433E6 for ; Thu, 21 Jan 2021 17:17:18 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B25E423A57 for ; Thu, 21 Jan 2021 17:17:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B25E423A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com 
(localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 7FC4B21FC25; Thu, 21 Jan 2021 09:17:15 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4034A21FA40 for ; Thu, 21 Jan 2021 09:17:09 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 3E3321008053; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 3BD38115A7; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:34 -0500 Message-Id: <1611249422-556-12-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 11/39] lnet: lnd: Use NETWORK_TIMEOUT for txs on ibp_tx_queue X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn TXs on the ibp_tx_queue are waiting for a connection to be established. Failure to establish a connection could be due to a problem with either the local NI or the remote NI, and o2iblnd cannot currently distinguish between these failures. As such, it should return LNET_MSG_STATUS_NETWORK_TIMEOUT to LNet so that LNet will decrement the health value of both the local NI and the remote NI and future sends can take these health values into account. 
HPE-bug-id: LUS-9342 WC-bug-id: https://jira.whamcloud.com/browse/LU-13571 Lustre-commit: 7af63191370fd2 ("LU-13571 lnd: Use NETWORK_TIMEOUT for txs on ibp_tx_queue") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/39899 Reviewed-by: Amir Shehata Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c index 3d7026b..9766aa2 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -3308,7 +3308,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, if (!list_empty(&timedout_txs)) kiblnd_txlist_done(&timedout_txs, -ETIMEDOUT, - LNET_MSG_STATUS_LOCAL_TIMEOUT); + LNET_MSG_STATUS_NETWORK_TIMEOUT); /* * Handle timeout by closing the whole From patchwork Thu Jan 21 17:16:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 002B0C433E6 for ; Thu, 21 Jan 2021 17:17:23 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 957C323A57 for ; Thu, 21 Jan 2021 17:17:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2
mail.kernel.org 957C323A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4769C21FD67; Thu, 21 Jan 2021 09:17:19 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 77B3321FA40 for ; Thu, 21 Jan 2021 09:17:09 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 422B91008054; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 3F078115B8; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:35 -0500 Message-Id: <1611249422-556-13-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 12/39] lnet: lnd: Use NETWORK_TIMEOUT for some conn failures X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn For -EHOSTUNREACH and -ETIMEDOUT we cannot tell whether the connection failure was due to a problem with the remote or local NI, so we should return the LNET_MSG_STATUS_NETWORK_TIMEOUT to LNet in these cases. 
HPE-bug-id: LUS-9342 WC-bug-id: https://jira.whamcloud.com/browse/LU-13571 Lustre-commit: 12333c1fecc00e ("LU-13571 lnd: Use NETWORK_TIMEOUT for some conn failures") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/39900 Reviewed-by: Amir Shehata Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c index 9766aa2..20d555f 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -2143,8 +2143,12 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, CNETERR("Deleting messages for %s: connection failed\n", libcfs_nid2str(peer_ni->ibp_nid)); - kiblnd_txlist_done(&zombies, error, - LNET_MSG_STATUS_LOCAL_DROPPED); + if (error == -EHOSTUNREACH || error == -ETIMEDOUT) + kiblnd_txlist_done(&zombies, error, + LNET_MSG_STATUS_NETWORK_TIMEOUT); + else + kiblnd_txlist_done(&zombies, error, + LNET_MSG_STATUS_LOCAL_DROPPED); } static void From patchwork Thu Jan 21 17:16:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037175 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A07CC433E0 for ; Thu, 21 Jan 2021 17:18:01 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B887E23A5A for ; Thu, 21 Jan 2021 17:18:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B887E23A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C5F5021FDCF; Thu, 21 Jan 2021 09:17:40 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B0C4421FB64 for ; Thu, 21 Jan 2021 09:17:09 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 441E21008055; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 426EDF0A5; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:36 -0500 Message-Id: <1611249422-556-14-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 13/39] lustre: llite: allow DIO with unaligned IO count X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong DIO only requires the user buffer and the IO offset to be page aligned; it is OK for the IO count not to be page aligned. Remove this unnecessary limit so that DIO can be used with files whose size is not aligned to PAGE_SIZE. WC-bug-id: https://jira.whamcloud.com/browse/LU-14043 Lustre-commit: 45c46c6effd827 ("LU-14043 llite: allow DIO with unaligned IO count") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/40392 Reviewed-by: John L. Hammond Reviewed-by: Andreas Dilger Reviewed-by: Alex Zhuravlev Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/rw26.c | 33 +++++++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c index 605a326..28c0a75 100644 --- a/fs/lustre/llite/rw26.c +++ b/fs/lustre/llite/rw26.c @@ -181,6 +181,35 @@ static ssize_t ll_get_user_pages(int rw, struct iov_iter *iter, return result; } +/* + * Lustre could relax a bit for alignment, io count is not + * necessary page alignment.
+ */ +static unsigned long ll_iov_iter_alignment(struct iov_iter *i) +{ + size_t orig_size = i->count; + size_t count = orig_size & ~PAGE_MASK; + unsigned long res; + + if (!count) + return iov_iter_alignment(i); + + if (orig_size > PAGE_SIZE) { + iov_iter_truncate(i, orig_size - count); + res = iov_iter_alignment(i); + iov_iter_reexpand(i, orig_size); + + return res; + } + + res = iov_iter_alignment(i); + /* start address is page aligned */ + if ((res & ~PAGE_MASK) == orig_size) + return PAGE_SIZE; + + return res; +} + /* direct IO pages */ struct ll_dio_pages { struct cl_dio_aio *ldp_aio; @@ -325,7 +354,7 @@ static ssize_t ll_direct_IO(struct kiocb *iocb, struct iov_iter *iter) return 0; /* FIXME: io smaller than PAGE_SIZE is broken on ia64 ??? */ - if ((file_offset & ~PAGE_MASK) || (count & ~PAGE_MASK)) + if (file_offset & ~PAGE_MASK) return -EINVAL; CDEBUG(D_VFSTRACE, @@ -335,7 +364,7 @@ static ssize_t ll_direct_IO(struct kiocb *iocb, struct iov_iter *iter) MAX_DIO_SIZE >> PAGE_SHIFT); /* Check that all user buffers are aligned as well */ - if (iov_iter_alignment(iter) & ~PAGE_MASK) + if (ll_iov_iter_alignment(iter) & ~PAGE_MASK) return -EINVAL; lcc = ll_cl_find(file); From patchwork Thu Jan 21 17:16:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037147 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B154BC433E0 for ; Thu, 21 Jan 2021 17:17:21 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com 
(pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F0E4223A57 for ; Thu, 21 Jan 2021 17:17:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F0E4223A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E189F21FD2B; Thu, 21 Jan 2021 09:17:17 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E949721FB7C for ; Thu, 21 Jan 2021 09:17:09 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 469481008056; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 45675F0A7; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:37 -0500 Message-Id: <1611249422-556-15-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 14/39] lustre: osc: skip 0 row for rpc_stats X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Yang Sheng , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Yang Sheng Fix the rpc_stats output: it should not print the 0 row, as that row makes no sense. WC-bug-id: https://jira.whamcloud.com/browse/LU-14130 Lustre-commit: 596f74c122f5ed ("LU-14130 osc: skip 0 row for rpc_stats") Signed-off-by: Yang Sheng Reviewed-on: https://review.whamcloud.com/40613 Reviewed-by: Andreas Dilger Reviewed-by: Jian Yu Signed-off-by: James Simmons --- fs/lustre/mdc/lproc_mdc.c | 2 +- fs/lustre/osc/lproc_osc.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/lustre/mdc/lproc_mdc.c b/fs/lustre/mdc/lproc_mdc.c index 662be42..ce03999 100644 --- a/fs/lustre/mdc/lproc_mdc.c +++ b/fs/lustre/mdc/lproc_mdc.c @@ -383,7 +383,7 @@ static int mdc_rpc_stats_seq_show(struct seq_file *seq, void *v) read_cum = 0; write_cum = 0; - for (i = 0; i < OBD_HIST_MAX; i++) { + for (i = 1; i < OBD_HIST_MAX; i++) { unsigned long r = cli->cl_read_rpc_hist.oh_buckets[i]; unsigned long w = cli->cl_write_rpc_hist.oh_buckets[i]; diff --git a/fs/lustre/osc/lproc_osc.c b/fs/lustre/osc/lproc_osc.c index 7ea9530..89b55c3 100644 --- a/fs/lustre/osc/lproc_osc.c +++ b/fs/lustre/osc/lproc_osc.c @@ -808,7 +808,7 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v) read_cum = 0; write_cum = 0; - for (i = 0; i < OBD_HIST_MAX; i++) { + for (i = 1; i < OBD_HIST_MAX; i++) { unsigned long r = cli->cl_read_rpc_hist.oh_buckets[i]; unsigned long w = cli->cl_write_rpc_hist.oh_buckets[i]; From patchwork Thu Jan 21 17:16:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037201 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0
tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0F42C433E0 for ; Thu, 21 Jan 2021 17:18:47 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7251823A57 for ; Thu, 21 Jan 2021 17:18:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7251823A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id D5BF921FF2C; Thu, 21 Jan 2021 09:18:01 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2D1B621FB7C for ; Thu, 21 Jan 2021 09:17:10 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 49B991008057; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 4859C1158C; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:38 -0500 Message-Id: <1611249422-556-16-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 15/39] lustre: quota: df should return projid-specific 
values X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong With local ext4 and XFS filesystems, it is possible to use "df /path/to/directory" (statfs()) to return the current project quota usage for that directory as "used", and min(projid quota limit, free space) as "total". statfs() is a natural interface for users/applications, since it represents the used/maximum space for that subdirectory. Otherwise, the user will get EDQUOT back when the project quota runs out for that directory and applications will not be able to figure out how much data they could write into that directory. WC-bug-id: https://jira.whamcloud.com/browse/LU-9555 Lustre-commit: e5c8f6670fbeea ("LU-9555 quota: df should return projid-specific values") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/36685 Reviewed-by: Andreas Dilger Reviewed-by: Hongchao Zhang Signed-off-by: James Simmons --- fs/lustre/llite/dir.c | 2 +- fs/lustre/llite/llite_internal.h | 1 + fs/lustre/llite/llite_lib.c | 49 +++++++++++++++++++++++++++++++++ include/uapi/linux/lustre/lustre_user.h | 6 ++-- 4 files changed, 54 insertions(+), 4 deletions(-) diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 6bc95d9..db620ce 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -1079,7 +1079,7 @@ static int check_owner(int type, int id) return 0; } -static int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl) +int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl) { int cmd = qctl->qc_cmd; int type = qctl->qc_type; diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 9d988aac..bad974f 100644 --- 
a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -996,6 +996,7 @@ int ll_dir_read(struct inode *inode, u64 *ppos, struct md_op_data *op_data, struct page *ll_get_dir_page(struct inode *dir, struct md_op_data *op_data, u64 offset); void ll_release_page(struct inode *inode, struct page *page, bool remove); +int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl); enum get_default_layout_type { GET_DEFAULT_LAYOUT_ROOT = 1, diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index e4036af..34bd661 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -2137,6 +2137,53 @@ int ll_statfs_internal(struct ll_sb_info *sbi, struct obd_statfs *osfs, return rc; } +static int ll_statfs_project(struct inode *inode, struct kstatfs *sfs) +{ + struct if_quotactl qctl = { + .qc_cmd = LUSTRE_Q_GETQUOTA, + .qc_type = PRJQUOTA, + .qc_valid = QC_GENERAL, + }; + u64 limit, curblock; + int ret; + + qctl.qc_id = ll_i2info(inode)->lli_projid; + ret = quotactl_ioctl(ll_i2sbi(inode), &qctl); + if (ret) { + /* ignore errors if project ID does not have + * a quota limit or feature unsupported. + */ + if (ret == -ESRCH || ret == -EOPNOTSUPP) + ret = 0; + return ret; + } + + limit = ((qctl.qc_dqblk.dqb_bsoftlimit ? + qctl.qc_dqblk.dqb_bsoftlimit : + qctl.qc_dqblk.dqb_bhardlimit) * 1024) / sfs->f_bsize; + if (limit && sfs->f_blocks > limit) { + curblock = (qctl.qc_dqblk.dqb_curspace + + sfs->f_bsize - 1) / sfs->f_bsize; + sfs->f_blocks = limit; + sfs->f_bavail = + (sfs->f_blocks > curblock) ? + (sfs->f_blocks - curblock) : 0; + sfs->f_bfree = sfs->f_bavail; + } + + limit = qctl.qc_dqblk.dqb_isoftlimit ? + qctl.qc_dqblk.dqb_isoftlimit : + qctl.qc_dqblk.dqb_ihardlimit; + if (limit && sfs->f_files > limit) { + sfs->f_files = limit; + sfs->f_ffree = (sfs->f_files > + qctl.qc_dqblk.dqb_curinodes) ? 
+ (sfs->f_files - qctl.qc_dqblk.dqb_curinodes) : 0; + } + + return 0; +} + int ll_statfs(struct dentry *de, struct kstatfs *sfs) { struct super_block *sb = de->d_sb; @@ -2174,6 +2221,8 @@ int ll_statfs(struct dentry *de, struct kstatfs *sfs) sfs->f_bavail = osfs.os_bavail; sfs->f_fsid.val[0] = (u32)fsid; sfs->f_fsid.val[1] = (u32)(fsid >> 32); + if (ll_i2info(de->d_inode)->lli_projid) + return ll_statfs_project(de->d_inode, sfs); ll_stats_ops_tally(ll_s2sbi(sb), LPROC_LL_STATFS, ktime_us_delta(ktime_get(), kstart)); diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h index 143b7d5..62c6952 100644 --- a/include/uapi/linux/lustre/lustre_user.h +++ b/include/uapi/linux/lustre/lustre_user.h @@ -1043,9 +1043,9 @@ struct obd_dqinfo { /* XXX: same as if_dqblk struct in kernel, plus one padding */ struct obd_dqblk { - __u64 dqb_bhardlimit; - __u64 dqb_bsoftlimit; - __u64 dqb_curspace; + __u64 dqb_bhardlimit; /* kbytes unit */ + __u64 dqb_bsoftlimit; /* kbytes unit */ + __u64 dqb_curspace; /* bytes unit */ __u64 dqb_ihardlimit; __u64 dqb_isoftlimit; __u64 dqb_curinodes; From patchwork Thu Jan 21 17:16:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037205 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91EE0C433DB for ; Thu, 21 Jan 2021 17:18:54 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3D4AF23A57 for ; Thu, 21 Jan 2021 17:18:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3D4AF23A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 1BFA221FD2A; Thu, 21 Jan 2021 09:18:05 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 7C68F21FB9F for ; Thu, 21 Jan 2021 09:17:10 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 4D8FB1008058; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 4B57A1B49B; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:39 -0500 Message-Id: <1611249422-556-17-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 16/39] lnet: discard the callback X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Yang Sheng , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Yang Sheng Lustre needs a completion callback for the event that a request has been sent, and then another callback when the reply arrives.
Sometimes the request completion callback may be lost for some reason even though the reply has been received, and the system then waits forever, even past the timeout. There is no need to wait for request completion in such a case, so provide a way to discard the callback. WC-bug-id: https://jira.whamcloud.com/browse/LU-13368 Lustre-commit: babf0232273467 ("LU-13368 lnet: discard the callback") Signed-off-by: Yang Sheng Reviewed-on: https://review.whamcloud.com/38845 Reviewed-by: Amir Shehata Reviewed-by: Cyril Bordage Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_net.h | 15 +++++++++- fs/lustre/ptlrpc/client.c | 15 ++++++---- fs/lustre/ptlrpc/niobuf.c | 7 +++-- include/linux/lnet/api.h | 3 +- include/linux/lnet/lib-lnet.h | 1 + include/linux/lnet/lib-types.h | 1 + net/lnet/klnds/o2iblnd/o2iblnd.c | 1 + net/lnet/klnds/o2iblnd/o2iblnd.h | 4 +++ net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 58 +++++++++++++++++++++++++++++++++++-- net/lnet/lnet/lib-md.c | 25 ++++++++++++++-- 10 files changed, 117 insertions(+), 13 deletions(-) diff --git a/fs/lustre/include/lustre_net.h b/fs/lustre/include/lustre_net.h index 61be05c..f16c935 100644 --- a/fs/lustre/include/lustre_net.h +++ b/fs/lustre/include/lustre_net.h @@ -2225,8 +2225,10 @@ static inline int ptlrpc_status_ntoh(int n) return req->rq_receiving_reply; } +#define ptlrpc_cli_wait_unlink(req) __ptlrpc_cli_wait_unlink(req, NULL) + static inline int -ptlrpc_client_recv_or_unlink(struct ptlrpc_request *req) +__ptlrpc_cli_wait_unlink(struct ptlrpc_request *req, bool *discard) { int rc; @@ -2239,6 +2241,17 @@ static inline int ptlrpc_status_ntoh(int n) spin_unlock(&req->rq_lock); return 1; } + + if (discard) { + *discard = false; + if (req->rq_reply_unlinked && req->rq_req_unlinked == 0) { + *discard = true; + spin_unlock(&req->rq_lock); + return 1; /* Should call again after LNetMDUnlink */ + } + } + + rc = !req->rq_req_unlinked || !req->rq_reply_unlinked || req->rq_receiving_reply; spin_unlock(&req->rq_lock); diff --git
a/fs/lustre/ptlrpc/client.c b/fs/lustre/ptlrpc/client.c index 4b8aa25..cec4da99 100644 --- a/fs/lustre/ptlrpc/client.c +++ b/fs/lustre/ptlrpc/client.c @@ -1783,7 +1783,7 @@ int ptlrpc_check_set(const struct lu_env *env, struct ptlrpc_request_set *set) * not corrupt any data. */ if (req->rq_phase == RQ_PHASE_UNREG_RPC && - ptlrpc_client_recv_or_unlink(req)) + ptlrpc_cli_wait_unlink(req)) continue; if (req->rq_phase == RQ_PHASE_UNREG_BULK && ptlrpc_client_bulk_active(req)) @@ -1821,7 +1821,7 @@ int ptlrpc_check_set(const struct lu_env *env, struct ptlrpc_request_set *set) ptlrpc_expire_one_request(req, 1); /* Check if we still need to wait for unlink. */ - if (ptlrpc_client_recv_or_unlink(req) || + if (ptlrpc_cli_wait_unlink(req) || ptlrpc_client_bulk_active(req)) continue; /* If there is no need to resend, fail it now. */ @@ -2599,6 +2599,8 @@ u64 ptlrpc_req_xid(struct ptlrpc_request *request) */ static int ptlrpc_unregister_reply(struct ptlrpc_request *request, int async) { + bool discard = false; + /* Might sleep. */ LASSERT(!in_interrupt()); @@ -2609,13 +2611,16 @@ static int ptlrpc_unregister_reply(struct ptlrpc_request *request, int async) PTLRPC_REQ_LONG_UNLINK; /* Nothing left to do. */ - if (!ptlrpc_client_recv_or_unlink(request)) + if (!__ptlrpc_cli_wait_unlink(request, &discard)) return 1; LNetMDUnlink(request->rq_reply_md_h); + if (discard) /* Discard the request-out callback */ + __LNetMDUnlink(request->rq_req_md_h, discard); + /* Let's check it once again. */ - if (!ptlrpc_client_recv_or_unlink(request)) + if (!ptlrpc_cli_wait_unlink(request)) return 1; /* Move to "Unregistering" phase as reply was not unlinked yet. 
*/ @@ -2636,7 +2641,7 @@ static int ptlrpc_unregister_reply(struct ptlrpc_request *request, int async) */ while (seconds > PTLRPC_REQ_LONG_UNLINK && (wait_event_idle_timeout(*wq, - !ptlrpc_client_recv_or_unlink(request), + !ptlrpc_cli_wait_unlink(request), HZ)) == 0) seconds -= 1; if (seconds > 0) { diff --git a/fs/lustre/ptlrpc/niobuf.c b/fs/lustre/ptlrpc/niobuf.c index a1e6581..5ae7dd1 100644 --- a/fs/lustre/ptlrpc/niobuf.c +++ b/fs/lustre/ptlrpc/niobuf.c @@ -103,12 +103,15 @@ static int ptl_send_buf(struct lnet_handle_md *mdh, void *base, int len, return 0; } -static void mdunlink_iterate_helper(struct lnet_handle_md *bd_mds, int count) +#define mdunlink_iterate_helper(mds, count) \ + __mdunlink_iterate_helper(mds, count, false) +static void __mdunlink_iterate_helper(struct lnet_handle_md *bd_mds, + int count, bool discard) { int i; for (i = 0; i < count; i++) - LNetMDUnlink(bd_mds[i]); + __LNetMDUnlink(bd_mds[i], discard); } /** diff --git a/include/linux/lnet/api.h b/include/linux/lnet/api.h index 064c92e..891c4a6 100644 --- a/include/linux/lnet/api.h +++ b/include/linux/lnet/api.h @@ -125,7 +125,8 @@ int LNetMDBind(const struct lnet_md *md_in, enum lnet_unlink unlink_in, struct lnet_handle_md *md_handle_out); -int LNetMDUnlink(struct lnet_handle_md md_in); +int __LNetMDUnlink(struct lnet_handle_md md_in, bool discard); +#define LNetMDUnlink(handle) __LNetMDUnlink(handle, false) void lnet_assert_handler_unused(lnet_handler_t handler); /** @} lnet_md */ diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h index 6253c16..d349f06 100644 --- a/include/linux/lnet/lib-lnet.h +++ b/include/linux/lnet/lib-lnet.h @@ -625,6 +625,7 @@ void lnet_set_reply_msg_len(struct lnet_ni *ni, struct lnet_msg *msg, void lnet_detach_rsp_tracker(struct lnet_libmd *md, int cpt); void lnet_clean_zombie_rstqs(void); +bool lnet_md_discarded(struct lnet_libmd *md); void lnet_finalize(struct lnet_msg *msg, int rc); bool lnet_send_error_simulation(struct lnet_msg *msg, 
enum lnet_msg_hstatus *hstatus); diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h index aaf2a46..7c9d7e2 100644 --- a/include/linux/lnet/lib-types.h +++ b/include/linux/lnet/lib-types.h @@ -222,6 +222,7 @@ struct lnet_libmd { * call. */ #define LNET_MD_FLAG_HANDLING BIT(3) +#define LNET_MD_FLAG_DISCARD BIT(4) struct lnet_test_peer { /* info about peers we are trying to fail */ diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c index c6a077b..9c65524 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd.c @@ -2732,6 +2732,7 @@ static int kiblnd_base_startup(struct net *ns) spin_lock_init(&kiblnd_data.kib_connd_lock); INIT_LIST_HEAD(&kiblnd_data.kib_connd_conns); + INIT_LIST_HEAD(&kiblnd_data.kib_connd_waits); INIT_LIST_HEAD(&kiblnd_data.kib_connd_zombies); INIT_LIST_HEAD(&kiblnd_data.kib_reconn_list); INIT_LIST_HEAD(&kiblnd_data.kib_reconn_wait); diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.h b/net/lnet/klnds/o2iblnd/o2iblnd.h index 2b8d5ff..1fc68e1 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.h +++ b/net/lnet/klnds/o2iblnd/o2iblnd.h @@ -360,6 +360,8 @@ struct kib_data { struct list_head kib_reconn_list; /* peers wait for reconnection */ struct list_head kib_reconn_wait; + /* connections wait for completion */ + struct list_head kib_connd_waits; /* * The second that peers are pulled out from @kib_reconn_wait * for reconnection. 
@@ -567,6 +569,8 @@ struct kib_conn { u16 ibc_queue_depth; /* connections max frags */ u16 ibc_max_frags; + /* count of timeout txs waiting on cq */ + u16 ibc_waits; unsigned int ibc_nrx:16; /* receive buffers owned */ unsigned int ibc_scheduled:1;/* scheduled for attention */ unsigned int ibc_ready:1; /* CQ callback fired */ diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c index 20d555f..5cd367e5 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -2052,6 +2052,10 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, if (!tx->tx_sending) { tx->tx_queued = 0; list_move(&tx->tx_list, &zombies); + } else { + /* keep tx until cq destroy */ + list_move(&tx->tx_list, &conn->ibc_zombie_txs); + conn->ibc_waits++; } } @@ -2065,6 +2069,31 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, kiblnd_txlist_done(&zombies, -ECONNABORTED, LNET_MSG_STATUS_OK); } +static int +kiblnd_tx_may_discard(struct kib_conn *conn) +{ + int rc = 0; + struct kib_tx *nxt; + struct kib_tx *tx; + + spin_lock(&conn->ibc_lock); + + list_for_each_entry_safe(tx, nxt, &conn->ibc_zombie_txs, tx_list) { + if (tx->tx_sending > 0 && tx->tx_lntmsg[0] && + lnet_md_discarded(tx->tx_lntmsg[0]->msg_md)) { + tx->tx_sending--; + if (tx->tx_sending == 0) { + kiblnd_conn_decref(tx->tx_conn); + tx->tx_conn = NULL; + rc = 1; + } + } + } + + spin_unlock(&conn->ibc_lock); + return rc; +} + static void kiblnd_finalise_conn(struct kib_conn *conn) { @@ -3221,8 +3250,9 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, } if (ktime_compare(ktime_get(), tx->tx_deadline) >= 0) { - CERROR("Timed out tx: %s, %lld seconds\n", + CERROR("Timed out tx: %s(WSQ:%d%d%d), %lld seconds\n", kiblnd_queue2str(conn, txs), + tx->tx_waiting, tx->tx_sending, tx->tx_queued, kiblnd_timeout() + ktime_ms_delta(ktime_get(), tx->tx_deadline) / MSEC_PER_SEC); @@ -3426,15 +3456,23 @@ static int kiblnd_map_tx(struct lnet_ni *ni, 
struct kib_tx *tx, conn = list_first_entry_or_null(&kiblnd_data.kib_connd_conns, struct kib_conn, ibc_list); if (conn) { + int wait; + list_del(&conn->ibc_list); spin_unlock_irqrestore(lock, flags); dropped_lock = 1; kiblnd_disconnect_conn(conn); - kiblnd_conn_decref(conn); + wait = conn->ibc_waits; + if (wait == 0) /* keep ref for connd_wait, see below */ + kiblnd_conn_decref(conn); spin_lock_irqsave(lock, flags); + + if (wait) + list_add_tail(&conn->ibc_list, + &kiblnd_data.kib_connd_waits); } while (reconn < KIB_RECONN_BREAK) { @@ -3462,6 +3500,22 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, spin_lock_irqsave(lock, flags); } + conn = list_first_entry_or_null(&kiblnd_data.kib_connd_conns, + struct kib_conn, ibc_list); + if (conn) { + list_del(&conn->ibc_list); + spin_unlock_irqrestore(lock, flags); + + dropped_lock = kiblnd_tx_may_discard(conn); + if (dropped_lock) + kiblnd_conn_decref(conn); + + spin_lock_irqsave(lock, flags); + if (dropped_lock == 0) + list_add_tail(&conn->ibc_list, + &kiblnd_data.kib_connd_waits); + } + /* careful with the jiffy wrap... */ timeout = (int)(deadline - jiffies); if (timeout <= 0) { diff --git a/net/lnet/lnet/lib-md.c b/net/lnet/lnet/lib-md.c index 203c794..b3f758c 100644 --- a/net/lnet/lnet/lib-md.c +++ b/net/lnet/lnet/lib-md.c @@ -465,7 +465,7 @@ void lnet_assert_handler_unused(lnet_handler_t handler) * -ENOENT If @mdh does not point to a valid MD object. 
*/ int -LNetMDUnlink(struct lnet_handle_md mdh) +__LNetMDUnlink(struct lnet_handle_md mdh, bool discard) { struct lnet_event ev; struct lnet_libmd *md = NULL; @@ -502,6 +502,9 @@ void lnet_assert_handler_unused(lnet_handler_t handler) handler = md->md_handler; } + if (discard) + md->md_flags |= LNET_MD_FLAG_DISCARD; + if (md->md_rspt_ptr) lnet_detach_rsp_tracker(md, cpt); @@ -514,4 +517,22 @@ void lnet_assert_handler_unused(lnet_handler_t handler) return 0; } -EXPORT_SYMBOL(LNetMDUnlink); +EXPORT_SYMBOL(__LNetMDUnlink); + +bool +lnet_md_discarded(struct lnet_libmd *md) +{ + bool rc; + int cpt; + + if (!md) + return false; + + cpt = lnet_cpt_of_cookie(md->md_lh.lh_cookie); + lnet_res_lock(cpt); + rc = md->md_flags & LNET_MD_FLAG_DISCARD; + lnet_res_unlock(cpt); + + return rc; +} +EXPORT_SYMBOL(lnet_md_discarded); From patchwork Thu Jan 21 17:16:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037181 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 819CAC433E0 for ; Thu, 21 Jan 2021 17:18:10 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3964D23A5A for ; Thu, 21 Jan 2021 17:18:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3964D23A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
 header.from=infradead.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org
Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2844721FDDD;
	Thu, 21 Jan 2021 09:17:45 -0800 (PST)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C704521FBB4
	for ; Thu, 21 Jan 2021 09:17:10 -0800 (PST)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 4FDDB1008061;
	Thu, 21 Jan 2021 12:17:05 -0500 (EST)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 4E6571B49C; Thu, 21 Jan 2021 12:17:05 -0500 (EST)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Thu, 21 Jan 2021 12:16:40 -0500
Message-Id: <1611249422-556-18-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org>
References: <1611249422-556-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 17/39] lustre: llite: try to improve mmap performance
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Wang Shilong , Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Wang Shilong

We have observed slow mmap read performance for some applications.
The problem arises when the access pattern is neither sequential nor
strided: reads stay adjacent within a small range and then seek to a
random position.

So the pattern could be something like this:

[1M data] [hole] [0.5M data] [hole] [0.7M data] [1M data]

Every time an application reads mmap data, it may read not only a
single 4KB page but also a cluster of nearby pages within a range
(e.g. 1MB) of the first page after a cache miss.

The readahead engine is modified to track the range size of
a cluster of mmap reads, so that after a seek and/or cache miss,
the range size is used to efficiently prefetch multiple pages
in a single RPC rather than many small RPCs.

Benchmark:
fio --name=randread --directory=/ai400/fio --rw=randread \
    --ioengine=mmap --bs=128K --numjobs=32 --filesize=200G \
    --filename=randread --time_based --status-interval=10s \
    --runtime=30s --allow_file_create=1 --group_reporting \
    --disable_lat=1 --disable_clat=1 --disable_slat=1 \
    --disk_util=0 --aux-path=/tmp --randrepeat=0 \
    --unique_filename=0 --fallocate=0

               |   master  |   patched  |  speedup  |
---------------+-----------+------------+-----------+
page_fault_avg |   512usec |     52usec |   9.75x
page_fault_max | 37698usec |   6543usec |   5.76x

WC-bug-id: https://jira.whamcloud.com/browse/LU-13669
Lustre-commit: 0c5ad4b6df5bf3 ("LU-13669 llite: try to improve mmap performance")
Signed-off-by: Wang Shilong
Reviewed-on: https://review.whamcloud.com/38916
Reviewed-by: Andreas Dilger
Reviewed-by: Yingjin Qian
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/llite/llite_internal.h |  18 +++++
 fs/lustre/llite/llite_lib.c      |   1 +
 fs/lustre/llite/lproc_llite.c    |  47 +++++++++++++
 fs/lustre/llite/rw.c             | 142 +++++++++++++++++++++++++++++++++++----
 4 files changed, 196 insertions(+), 12 deletions(-)

diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index bad974f..797dfea 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -482,6 +482,12 @@ static inline struct pcc_inode *ll_i2pcci(struct inode *inode)
 
 /* default read-ahead full files smaller than limit on the second read */
 #define
SBI_DEFAULT_READ_AHEAD_WHOLE_MAX MiB_TO_PAGES(2UL) +/* default range pages */ +#define SBI_DEFAULT_RA_RANGE_PAGES MiB_TO_PAGES(1ULL) + +/* Min range pages */ +#define RA_MIN_MMAP_RANGE_PAGES 16UL + enum ra_stat { RA_STAT_HIT = 0, RA_STAT_MISS, @@ -498,6 +504,7 @@ enum ra_stat { RA_STAT_FAILED_REACH_END, RA_STAT_ASYNC, RA_STAT_FAILED_FAST_READ, + RA_STAT_MMAP_RANGE_READ, _NR_RA_STAT, }; @@ -505,6 +512,7 @@ struct ll_ra_info { atomic_t ra_cur_pages; unsigned long ra_max_pages; unsigned long ra_max_pages_per_file; + unsigned long ra_range_pages; unsigned long ra_max_read_ahead_whole_pages; struct workqueue_struct *ll_readahead_wq; /* @@ -790,6 +798,16 @@ struct ll_readahead_state { */ pgoff_t ras_window_start_idx; pgoff_t ras_window_pages; + + /* Page index where min range read starts */ + pgoff_t ras_range_min_start_idx; + /* Page index where mmap range read ends */ + pgoff_t ras_range_max_end_idx; + /* number of mmap pages where last time detected */ + pgoff_t ras_last_range_pages; + /* number of mmap range requests */ + pgoff_t ras_range_requests; + /* * Optimal RPC size in pages. * It decides how many pages will be sent for each read-ahead. 
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 34bd661..c560492 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -130,6 +130,7 @@ static struct ll_sb_info *ll_init_sbi(void) SBI_DEFAULT_READ_AHEAD_PER_FILE_MAX); sbi->ll_ra_info.ra_async_pages_per_file_threshold = sbi->ll_ra_info.ra_max_pages_per_file; + sbi->ll_ra_info.ra_range_pages = SBI_DEFAULT_RA_RANGE_PAGES; sbi->ll_ra_info.ra_max_read_ahead_whole_pages = -1; atomic_set(&sbi->ll_ra_info.ra_async_inflight, 0); diff --git a/fs/lustre/llite/lproc_llite.c b/fs/lustre/llite/lproc_llite.c index 9b1c392..5d1e2f4 100644 --- a/fs/lustre/llite/lproc_llite.c +++ b/fs/lustre/llite/lproc_llite.c @@ -1173,6 +1173,51 @@ static ssize_t read_ahead_async_file_threshold_mb_show(struct kobject *kobj, } LUSTRE_RW_ATTR(read_ahead_async_file_threshold_mb); +static ssize_t read_ahead_range_kb_show(struct kobject *kobj, + struct attribute *attr, char *buf) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + + return scnprintf(buf, PAGE_SIZE, "%lu\n", + sbi->ll_ra_info.ra_range_pages << (PAGE_SHIFT - 10)); +} + +static ssize_t +read_ahead_range_kb_store(struct kobject *kobj, + struct attribute *attr, + const char *buffer, size_t count) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + unsigned long pages_number; + unsigned long max_ra_per_file; + u64 val; + int rc; + + rc = sysfs_memparse(buffer, count, &val, "KiB"); + if (rc < 0) + return rc; + + pages_number = val >> PAGE_SHIFT; + /* Disable mmap range read */ + if (pages_number == 0) + goto out; + + max_ra_per_file = sbi->ll_ra_info.ra_max_pages_per_file; + if (pages_number > max_ra_per_file || + pages_number < RA_MIN_MMAP_RANGE_PAGES) + return -ERANGE; + +out: + spin_lock(&sbi->ll_lock); + sbi->ll_ra_info.ra_range_pages = pages_number; + spin_unlock(&sbi->ll_lock); + + return count; +} +LUSTRE_RW_ATTR(read_ahead_range_kb); + static ssize_t 
fast_read_show(struct kobject *kobj, struct attribute *attr, char *buf) @@ -1506,6 +1551,7 @@ struct ldebugfs_vars lprocfs_llite_obd_vars[] = { &lustre_attr_max_read_ahead_mb.attr, &lustre_attr_max_read_ahead_per_file_mb.attr, &lustre_attr_max_read_ahead_whole_mb.attr, + &lustre_attr_read_ahead_range_kb.attr, &lustre_attr_checksums.attr, &lustre_attr_checksum_pages.attr, &lustre_attr_stats_track_pid.attr, @@ -1622,6 +1668,7 @@ void ll_stats_ops_tally(struct ll_sb_info *sbi, int op, long count) [RA_STAT_FAILED_REACH_END] = "failed to reach end", [RA_STAT_ASYNC] = "async readahead", [RA_STAT_FAILED_FAST_READ] = "failed to fast read", + [RA_STAT_MMAP_RANGE_READ] = "mmap range read", }; int ll_debugfs_register_super(struct super_block *sb, const char *name) diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c index da4a26d..096e015 100644 --- a/fs/lustre/llite/rw.c +++ b/fs/lustre/llite/rw.c @@ -388,7 +388,7 @@ static bool ras_inside_ra_window(pgoff_t idx, struct ra_io_arg *ria) static unsigned long ll_read_ahead_pages(const struct lu_env *env, struct cl_io *io, struct cl_page_list *queue, struct ll_readahead_state *ras, - struct ra_io_arg *ria, pgoff_t *ra_end) + struct ra_io_arg *ria, pgoff_t *ra_end, pgoff_t skip_index) { struct cl_read_ahead ra = { 0 }; pgoff_t page_idx; @@ -402,6 +402,8 @@ static bool ras_inside_ra_window(pgoff_t idx, struct ra_io_arg *ria) for (page_idx = ria->ria_start_idx; page_idx <= ria->ria_end_idx && ria->ria_reserved > 0; page_idx++) { + if (skip_index && page_idx == skip_index) + continue; if (ras_inside_ra_window(page_idx, ria)) { if (!ra.cra_end_idx || ra.cra_end_idx < page_idx) { pgoff_t end_idx; @@ -447,10 +449,12 @@ static bool ras_inside_ra_window(pgoff_t idx, struct ra_io_arg *ria) if (ras->ras_rpc_pages != ra.cra_rpc_pages && ra.cra_rpc_pages > 0) ras->ras_rpc_pages = ra.cra_rpc_pages; - /* trim it to align with optimal RPC size */ - end_idx = ras_align(ras, ria->ria_end_idx + 1); - if (end_idx > 0 && !ria->ria_eof) - 
ria->ria_end_idx = end_idx - 1; + if (!skip_index) { + /* trim it to align with optimal RPC size */ + end_idx = ras_align(ras, ria->ria_end_idx + 1); + if (end_idx > 0 && !ria->ria_eof) + ria->ria_end_idx = end_idx - 1; + } if (ria->ria_end_idx < ria->ria_end_idx_min) ria->ria_end_idx = ria->ria_end_idx_min; } @@ -650,7 +654,7 @@ static void ll_readahead_handle_work(struct work_struct *wq) cl_2queue_init(queue); rc = ll_read_ahead_pages(env, io, &queue->c2_qin, ras, ria, - &ra_end_idx); + &ra_end_idx, 0); if (ria->ria_reserved != 0) ll_ra_count_put(ll_i2sbi(inode), ria->ria_reserved); if (queue->c2_qin.pl_nr > 0) { @@ -688,7 +692,7 @@ static void ll_readahead_handle_work(struct work_struct *wq) static int ll_readahead(const struct lu_env *env, struct cl_io *io, struct cl_page_list *queue, struct ll_readahead_state *ras, bool hit, - struct file *file) + struct file *file, pgoff_t skip_index) { struct vvp_io *vio = vvp_env_io(env); struct ll_thread_info *lti = ll_env_info(env); @@ -731,6 +735,9 @@ static int ll_readahead(const struct lu_env *env, struct cl_io *io, if (ras->ras_window_pages > 0) end_idx = ras->ras_window_start_idx + ras->ras_window_pages - 1; + if (skip_index) + end_idx = start_idx + ras->ras_window_pages - 1; + /* Enlarge the RA window to encompass the full read */ if (vio->vui_ra_valid && end_idx < vio->vui_ra_start_idx + vio->vui_ra_pages - 1) @@ -783,6 +790,10 @@ static int ll_readahead(const struct lu_env *env, struct cl_io *io, ria->ria_start_idx; } + /* don't over reserved for mmap range read */ + if (skip_index) + pages_min = 0; + ria->ria_reserved = ll_ra_count_get(ll_i2sbi(inode), ria, pages, pages_min); if (ria->ria_reserved < pages) @@ -793,8 +804,8 @@ static int ll_readahead(const struct lu_env *env, struct cl_io *io, atomic_read(&ll_i2sbi(inode)->ll_ra_info.ra_cur_pages), ll_i2sbi(inode)->ll_ra_info.ra_max_pages); - ret = ll_read_ahead_pages(env, io, queue, ras, ria, &ra_end_idx); - + ret = ll_read_ahead_pages(env, io, queue, ras, ria, 
&ra_end_idx,
+				  skip_index);
 	if (ria->ria_reserved)
 		ll_ra_count_put(ll_i2sbi(inode), ria->ria_reserved);
 
@@ -890,6 +901,10 @@ void ll_readahead_init(struct inode *inode, struct ll_readahead_state *ras)
 	ras_reset(ras, 0);
 	ras->ras_last_read_end_bytes = 0;
 	ras->ras_requests = 0;
+	ras->ras_range_min_start_idx = 0;
+	ras->ras_range_max_end_idx = 0;
+	ras->ras_range_requests = 0;
+	ras->ras_last_range_pages = 0;
 }
 
 /*
@@ -1033,6 +1048,73 @@ static inline bool is_loose_seq_read(struct ll_readahead_state *ras, loff_t pos)
 			     8UL << PAGE_SHIFT, 8UL << PAGE_SHIFT);
 }
 
+static inline bool is_loose_mmap_read(struct ll_sb_info *sbi,
+				      struct ll_readahead_state *ras,
+				      unsigned long pos)
+{
+	unsigned long range_pages = sbi->ll_ra_info.ra_range_pages;
+
+	return pos_in_window(pos, ras->ras_last_read_end_bytes,
+			     range_pages << PAGE_SHIFT,
+			     range_pages << PAGE_SHIFT);
+}
+
+/**
+ * We have observed slow mmap read performance for some
+ * applications. The problem is if access pattern is neither
+ * sequential nor stride, but could be still adjacent in a
+ * small range and then seek a random position.
+ *
+ * So the pattern could be something like this:
+ *
+ * [1M data] [hole] [0.5M data] [hole] [0.7M data] [1M data]
+ *
+ *
+ * Every time an application reads mmap data, it may not only
+ * read a single 4KB page, but also a cluster of nearby pages in
+ * a range (e.g. 1MB) of the first page after a cache miss.
+ *
+ * The readahead engine is modified to track the range size of
+ * a cluster of mmap reads, so that after a seek and/or cache miss,
+ * the range size is used to efficiently prefetch multiple pages
+ * in a single RPC rather than many small RPCs.
+ */ +static void ras_detect_cluster_range(struct ll_readahead_state *ras, + struct ll_sb_info *sbi, + unsigned long pos, unsigned long count) +{ + pgoff_t last_pages, pages; + pgoff_t end_idx = (pos + count - 1) >> PAGE_SHIFT; + + last_pages = ras->ras_range_max_end_idx - + ras->ras_range_min_start_idx + 1; + /* First time come here */ + if (!ras->ras_range_max_end_idx) + goto out; + + /* Random or Stride read */ + if (!is_loose_mmap_read(sbi, ras, pos)) + goto out; + + ras->ras_range_requests++; + if (ras->ras_range_max_end_idx < end_idx) + ras->ras_range_max_end_idx = end_idx; + + if (ras->ras_range_min_start_idx > (pos >> PAGE_SHIFT)) + ras->ras_range_min_start_idx = pos >> PAGE_SHIFT; + + /* Out of range, consider it as random or stride */ + pages = ras->ras_range_max_end_idx - + ras->ras_range_min_start_idx + 1; + if (pages <= sbi->ll_ra_info.ra_range_pages) + return; +out: + ras->ras_last_range_pages = last_pages; + ras->ras_range_requests = 0; + ras->ras_range_min_start_idx = pos >> PAGE_SHIFT; + ras->ras_range_max_end_idx = end_idx; +} + static void ras_detect_read_pattern(struct ll_readahead_state *ras, struct ll_sb_info *sbi, loff_t pos, size_t count, bool mmap) @@ -1080,9 +1162,13 @@ static void ras_detect_read_pattern(struct ll_readahead_state *ras, ras->ras_consecutive_bytes += count; if (mmap) { + unsigned long ra_range_pages = + max_t(unsigned long, RA_MIN_MMAP_RANGE_PAGES, + sbi->ll_ra_info.ra_range_pages); pgoff_t idx = ras->ras_consecutive_bytes >> PAGE_SHIFT; - if ((idx >= 4 && (idx & 3UL) == 0) || stride_detect) + if ((idx >= ra_range_pages && + idx % ra_range_pages == 0) || stride_detect) ras->ras_need_increase_window = true; } else if ((ras->ras_consecutive_requests > 1 || stride_detect)) { ras->ras_need_increase_window = true; @@ -1190,10 +1276,36 @@ static void ras_update(struct ll_sb_info *sbi, struct inode *inode, if (ras->ras_no_miss_check) goto out_unlock; - if (flags & LL_RAS_MMAP) + if (flags & LL_RAS_MMAP) { + unsigned long ra_pages; 
+ + ras_detect_cluster_range(ras, sbi, index << PAGE_SHIFT, + PAGE_SIZE); ras_detect_read_pattern(ras, sbi, (loff_t)index << PAGE_SHIFT, PAGE_SIZE, true); + /* we did not detect anything but we could prefetch */ + if (!ras->ras_need_increase_window && + ras->ras_window_pages <= sbi->ll_ra_info.ra_range_pages && + ras->ras_range_requests >= 2) { + if (!hit) { + ra_pages = max_t(unsigned long, + RA_MIN_MMAP_RANGE_PAGES, + ras->ras_last_range_pages); + if (index < ra_pages / 2) + index = 0; + else + index -= ra_pages / 2; + ras->ras_window_pages = ra_pages; + ll_ra_stats_inc_sbi(sbi, + RA_STAT_MMAP_RANGE_READ); + } else { + ras->ras_window_pages = 0; + } + goto skip; + } + } + if (!hit && ras->ras_window_pages && index < ras->ras_next_readahead_idx && pos_in_window(index, ras->ras_window_start_idx, 0, @@ -1231,6 +1343,8 @@ static void ras_update(struct ll_sb_info *sbi, struct inode *inode, goto out_unlock; } } + +skip: ras_set_start(ras, index); if (stride_io_mode(ras)) { @@ -1500,8 +1614,12 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io, io_end_index = cl_index(io->ci_obj, io->u.ci_rw.crw_pos + io->u.ci_rw.crw_count - 1); if (ll_readahead_enabled(sbi) && ras) { + pgoff_t skip_index = 0; + + if (ras->ras_next_readahead_idx < vvp_index(vpg)) + skip_index = vvp_index(vpg); rc2 = ll_readahead(env, io, &queue->c2_qin, ras, - uptodate, file); + uptodate, file, skip_index); CDEBUG(D_READA, DFID " %d pages read ahead at %lu\n", PFID(ll_inode2fid(inode)), rc2, vvp_index(vpg)); } else if (vvp_index(vpg) == io_start_index && From patchwork Thu Jan 21 17:16:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, 
HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADB41C433E0 for ; Thu, 21 Jan 2021 17:17:29 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6624C23A57 for ; Thu, 21 Jan 2021 17:17:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6624C23A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9DD9321FD95; Thu, 21 Jan 2021 09:17:22 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 1BE3121FBB4 for ; Thu, 21 Jan 2021 09:17:11 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 525B91008062; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 5194A1B49D; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:41 -0500 Message-Id: <1611249422-556-19-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 18/39] lnet: Introduce lnet_recovery_limit parameter X-BeenThere: 
lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn This parameter controls how long LNet will attempt to recover an unhealthy interface. Defaults to 0 to indicate indefinite recovery. This maintains the current behavior. HPE-bug-id: LUS-9109 WC-bug-id: https://jira.whamcloud.com/browse/LU-13569 Lustre-commit: a2e61838f8de89 ("LU-13569 lnet: Introduce lnet_recovery_limit parameter") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/39716 Reviewed-by: Amir Shehata Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-lnet.h | 1 + net/lnet/lnet/api-ni.c | 5 +++++ 2 files changed, 6 insertions(+) diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h index d349f06..927ca44 100644 --- a/include/linux/lnet/lib-lnet.h +++ b/include/linux/lnet/lib-lnet.h @@ -476,6 +476,7 @@ struct lnet_ni * extern unsigned int lnet_numa_range; extern unsigned int lnet_health_sensitivity; extern unsigned int lnet_recovery_interval; +extern unsigned int lnet_recovery_limit; extern unsigned int lnet_peer_discovery_disabled; extern unsigned int lnet_drop_asym_route; extern unsigned int router_sensitivity_percentage; diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index 03473bf..322b25d 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -112,6 +112,11 @@ static int recovery_interval_set(const char *val, MODULE_PARM_DESC(lnet_recovery_interval, "Interval to recover unhealthy interfaces in seconds"); +unsigned int lnet_recovery_limit; +module_param(lnet_recovery_limit, uint, 0644); +MODULE_PARM_DESC(lnet_recovery_limit, + "How long to attempt recovery of unhealthy peer interfaces 
in seconds. Set to 0 to allow indefinite recovery"); + static int lnet_interfaces_max = LNET_INTERFACES_MAX_DEFAULT; static int intf_max_set(const char *val, const struct kernel_param *kp); module_param_call(lnet_interfaces_max, intf_max_set, param_get_int, From patchwork Thu Jan 21 17:16:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037187 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C39D0C433E0 for ; Thu, 21 Jan 2021 17:18:22 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 70C3523A5D for ; Thu, 21 Jan 2021 17:18:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 70C3523A5D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id D82EC21FE8B; Thu, 21 Jan 2021 09:17:50 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 547B921FBD8 for ; Thu, 21 Jan 2021 09:17:11 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by 
	smtp4.ccs.ornl.gov (Postfix) with ESMTP id 55B011008063;
	Thu, 21 Jan 2021 12:17:05 -0500 (EST)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 5487E1B49E; Thu, 21 Jan 2021 12:17:05 -0500 (EST)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Thu, 21 Jan 2021 12:16:42 -0500
Message-Id: <1611249422-556-20-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org>
References: <1611249422-556-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 19/39] lustre: mdc: avoid easize set to 0
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Yang Sheng , Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Yang Sheng

cl_default_mds_easize can be set to 0 in some cases. Check it before
packing the request and fall back to cl_max_mds_easize when it is 0.

Fixes: 05fc96e25b55 ("lustre: osd: Set max ea size to XATTR_SIZE_MAX") WC-bug-id: https://jira.whamcloud.com/browse/LU-14155 Lustre-commit: ff35e27da4c76b ("LU-14155 mdc: avoid easize set to 0") Signed-off-by: Yang Sheng Reviewed-on: https://review.whamcloud.com/40785 Reviewed-by: Andreas Dilger Reviewed-by: Mike Pershin Signed-off-by: James Simmons --- fs/lustre/mdc/mdc_locks.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/lustre/mdc/mdc_locks.c b/fs/lustre/mdc/mdc_locks.c index a82e8ca..8bbb9e1 100644 --- a/fs/lustre/mdc/mdc_locks.c +++ b/fs/lustre/mdc/mdc_locks.c @@ -554,7 +554,10 @@ static int mdc_save_lovea(struct ptlrpc_request *req, void *data, u32 size) lit = req_capsule_client_get(&req->rq_pill, &RMF_LDLM_INTENT); lit->opc = (u64)it->it_op; - easize = obd->u.cli.cl_default_mds_easize; + if (obd->u.cli.cl_default_mds_easize > 0) + easize = obd->u.cli.cl_default_mds_easize; + else + easize = obd->u.cli.cl_max_mds_easize; /* pack the intended request */ mdc_getattr_pack(req, valid, it->it_flags, op_data, easize); From patchwork Thu Jan 21 17:16:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037157 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 125D8C433E0 for ; Thu, 21 Jan 2021 17:17:35 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
	mail.kernel.org (Postfix) with ESMTPS id 9AEE623A59
	for ; Thu, 21 Jan 2021 17:17:34 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9AEE623A59
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org
Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id ADF3021FDBE;
	Thu, 21 Jan 2021 09:17:25 -0800 (PST)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 8D9E321FBD8
	for ; Thu, 21 Jan 2021 09:17:11 -0800 (PST)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 596CC1008064;
	Thu, 21 Jan 2021 12:17:05 -0500 (EST)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 57EC71B49F; Thu, 21 Jan 2021 12:17:05 -0500 (EST)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Thu, 21 Jan 2021 12:16:43 -0500
Message-Id: <1611249422-556-21-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org>
References: <1611249422-556-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 20/39] lustre: lmv: optimize dir shard revalidate
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lai Siyao , Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Lai Siyao

mdt_is_remote_object() checks whether the child is a directory shard
if the parent and child are on different MDTs. This requires reading
the LMV from disk and hurts striped directory stat performance.

This can be optimized: the client can just set the MDS_CROSS_REF flag
to do a cross-reference getattr, which avoids many of these checks.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14172
Lustre-commit: de47c7671f29b2 ("LU-14172 lmv: optimize dir shard revalidate")
Signed-off-by: Lai Siyao
Reviewed-on: https://review.whamcloud.com/40863
Reviewed-by: Andreas Dilger
Reviewed-by: Yingjin Qian
Signed-off-by: James Simmons
---
 fs/lustre/include/obd.h                |  2 +-
 fs/lustre/include/obd_class.h          |  3 +--
 fs/lustre/llite/file.c                 |  4 ++--
 fs/lustre/llite/llite_lib.c            |  4 ++--
 fs/lustre/lmv/lmv_intent.c             | 15 ++++++++-------
 fs/lustre/lmv/lmv_internal.h           |  1 -
 fs/lustre/lmv/lmv_obd.c                |  3 +--
 include/uapi/linux/lustre/lustre_idl.h |  7 +++++++
 8 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h
index a017997..de62005 100644
--- a/fs/lustre/include/obd.h
+++ b/fs/lustre/include/obd.h
@@ -1033,7 +1033,7 @@ struct md_ops {
 
 	int (*free_lustre_md)(struct obd_export *, struct lustre_md *);
 
-	int (*merge_attr)(struct obd_export *, const struct lu_fid *fid,
+	int (*merge_attr)(struct obd_export *exp,
 			  const struct lmv_stripe_md *lsm,
 			  struct cl_attr *attr, ldlm_blocking_callback);
 
diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h
index 1ac9fcf..b441215 100644
--- a/fs/lustre/include/obd_class.h
+++ b/fs/lustre/include/obd_class.h
@@ -1460,7 +1460,6 @@ static inline int md_free_lustre_md(struct obd_export *exp,
 }
 
 static inline int md_merge_attr(struct obd_export *exp,
-				const struct lu_fid *fid,
 				const struct lmv_stripe_md *lsm,
 				struct cl_attr *attr,
 				ldlm_blocking_callback cb)
@@ -1471,7 +1470,7 @@ static inline int md_merge_attr(struct obd_export *exp,
 	if (rc)
 		return rc;
 
-	return MDP(exp->exp_obd, merge_attr)(exp, fid, lsm, attr, cb);
+	return MDP(exp->exp_obd, merge_attr)(exp, lsm, attr, cb);
 }
 
 static inline int md_setxattr(struct obd_export *exp, const struct lu_fid *fid,
diff --git a/fs/lustre/llite/file.c
b/fs/lustre/llite/file.c index 2b0ffad..5d03fc3 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -4708,8 +4708,8 @@ static int ll_merge_md_attr(struct inode *inode) return 0; down_read(&lli->lli_lsm_sem); - rc = md_merge_attr(ll_i2mdexp(inode), &lli->lli_fid, lli->lli_lsm_md, - &attr, ll_md_blocking_ast); + rc = md_merge_attr(ll_i2mdexp(inode), lli->lli_lsm_md, &attr, + ll_md_blocking_ast); up_read(&lli->lli_lsm_sem); if (rc) return rc; diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index c560492..570d51a 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -1521,8 +1521,8 @@ static int ll_update_lsm_md(struct inode *inode, struct lustre_md *md) } /* validate the lsm */ - rc = md_merge_attr(ll_i2mdexp(inode), &lli->lli_fid, lli->lli_lsm_md, - attr, ll_md_blocking_ast); + rc = md_merge_attr(ll_i2mdexp(inode), lli->lli_lsm_md, attr, + ll_md_blocking_ast); if (!rc) { if (md->body->mbo_valid & OBD_MD_FLNLINK) md->body->mbo_nlink = attr->cat_nlink; diff --git a/fs/lustre/lmv/lmv_intent.c b/fs/lustre/lmv/lmv_intent.c index ad59b64..38b8c75 100644 --- a/fs/lustre/lmv/lmv_intent.c +++ b/fs/lustre/lmv/lmv_intent.c @@ -153,7 +153,6 @@ static int lmv_intent_remote(struct obd_export *exp, struct lookup_intent *it, } int lmv_revalidate_slaves(struct obd_export *exp, - const struct lu_fid *pfid, const struct lmv_stripe_md *lsm, ldlm_blocking_callback cb_blocking, int extra_lock_flags) @@ -197,11 +196,14 @@ int lmv_revalidate_slaves(struct obd_export *exp, * which is not needed here. */ memset(op_data, 0, sizeof(*op_data)); - if (exp_connect_flags2(exp) & OBD_CONNECT2_GETATTR_PFID) - op_data->op_fid1 = *pfid; - else - op_data->op_fid1 = fid; + op_data->op_fid1 = fid; op_data->op_fid2 = fid; + /* shard revalidate only needs to fetch attributes and UPDATE + * lock, which is similar to the bottom half of remote object + * getattr, set this flag so that MDT skips checking whether + * it's remote object. 
+ */ + op_data->op_bias = MDS_CROSS_REF; tgt = lmv_tgt(lmv, lsm->lsm_md_oinfo[i].lmo_mds); if (!tgt) { @@ -495,8 +497,7 @@ static int lmv_intent_lookup(struct obd_export *exp, * during update_inode process (see ll_update_lsm_md) */ if (lmv_dir_striped(op_data->op_mea2)) { - rc = lmv_revalidate_slaves(exp, &op_data->op_fid2, - op_data->op_mea2, + rc = lmv_revalidate_slaves(exp, op_data->op_mea2, cb_blocking, extra_lock_flags); if (rc != 0) diff --git a/fs/lustre/lmv/lmv_internal.h b/fs/lustre/lmv/lmv_internal.h index 756fa27..e42b141 100644 --- a/fs/lustre/lmv/lmv_internal.h +++ b/fs/lustre/lmv/lmv_internal.h @@ -53,7 +53,6 @@ int lmv_fid_alloc(const struct lu_env *env, struct obd_export *exp, struct lu_fid *fid, struct md_op_data *op_data); int lmv_revalidate_slaves(struct obd_export *exp, - const struct lu_fid *pfid, const struct lmv_stripe_md *lsm, ldlm_blocking_callback cb_blocking, int extra_lock_flags); diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c index fa1dae5..d845118 100644 --- a/fs/lustre/lmv/lmv_obd.c +++ b/fs/lustre/lmv/lmv_obd.c @@ -3482,7 +3482,6 @@ static int lmv_quotactl(struct obd_device *unused, struct obd_export *exp, } static int lmv_merge_attr(struct obd_export *exp, - const struct lu_fid *fid, const struct lmv_stripe_md *lsm, struct cl_attr *attr, ldlm_blocking_callback cb_blocking) @@ -3492,7 +3491,7 @@ static int lmv_merge_attr(struct obd_export *exp, if (!lmv_dir_striped(lsm)) return 0; - rc = lmv_revalidate_slaves(exp, fid, lsm, cb_blocking, 0); + rc = lmv_revalidate_slaves(exp, lsm, cb_blocking, 0); if (rc < 0) return rc; diff --git a/include/uapi/linux/lustre/lustre_idl.h b/include/uapi/linux/lustre/lustre_idl.h index f56b3c5..f953815 100644 --- a/include/uapi/linux/lustre/lustre_idl.h +++ b/include/uapi/linux/lustre/lustre_idl.h @@ -1705,6 +1705,13 @@ struct mdt_rec_setattr { enum mds_op_bias { /* MDS_CHECK_SPLIT = 1 << 0, obsolete before 2.3.58 */ + /* used for remote object getattr/open by name: in the original + * 
getattr/open request, MDT found the object against name is on another + * MDT, then packed FID and LOOKUP lock in reply and returned -EREMOTE, + * and client knew it's a remote object, then set this flag in + * getattr/open request and sent to the corresponding MDT to finish + * getattr/open, which fetched attributes and UPDATE lock/opened file. + */ MDS_CROSS_REF = 1 << 1, /* MDS_VTX_BYPASS = 1 << 2, obsolete since 2.3.54 */ MDS_PERM_BYPASS = 1 << 3, From patchwork Thu Jan 21 17:16:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037191 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 445FFC433DB for ; Thu, 21 Jan 2021 17:18:30 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7E31523A5A for ; Thu, 21 Jan 2021 17:18:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7E31523A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 26CD321FDFF; Thu, 21 Jan 2021 09:17:54 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by 
pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id D756521FBD8 for ; Thu, 21 Jan 2021 09:17:11 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 5BEC81008481; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 5B3E51B49B; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:44 -0500 Message-Id: <1611249422-556-22-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 21/39] lustre: ldlm: osc_object_ast_clear() is called for mdc object on eviction X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Andriy Skulysh , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andriy Skulysh Replace osc_object_prune() with cl_object_prune() PID: 3477 TASK: ffff9360d82fa0e0 CPU: 0 COMMAND: "ll_imp_inval" #0 [ffff9360d5c5b990] machine_kexec at ffffffff86865704 #1 [ffff9360d5c5b9f0] __crash_kexec at ffffffff869209a2 #2 [ffff9360d5c5bac0] panic at ffffffff86f7294c #3 [ffff9360d5c5bb40] lbug_with_loc at ffffffffc04b78cb [libcfs] #4 [ffff9360d5c5bb60] osc_object_ast_clear at ffffffffc0956471 [osc] #5 [ffff9360d5c5bbc8] ldlm_resource_foreach at ffffffffc07e2fd6 [ptlrpc] #6 [ffff9360d5c5bc08] ldlm_resource_iterate at ffffffffc07e3266 [ptlrpc] #7 [ffff9360d5c5bc38] osc_object_prune at ffffffffc0956140 [osc] #8 [ffff9360d5c5bc58] osc_object_invalidate at ffffffffc0956e12 [osc] #9 [ffff9360d5c5bcd0] osc_ldlm_resource_invalidate at ffffffffc09477bf [osc] HPE-bug-id: 
LUS-8399 WC-bug-id: https://jira.whamcloud.com/browse/LU-13994 Lustre-commit: 542d0059184060 ("LU-13994 ldlm: osc_object_ast_clear() is called for mdc object on eviction") Signed-off-by: Andriy Skulysh Reviewed-on: https://review.whamcloud.com/40052 Reviewed-by: Alexander Boyko Reviewed-by: Vitaly Fertman Tested-by: Alexander Lezhoev Reviewed-by: Alexander Boyko Reviewed-by: Vitaly Fertman Reviewed-by: Mike Pershin Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/osc/osc_object.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/lustre/osc/osc_object.c b/fs/lustre/osc/osc_object.c index 9a0fc54..273098a 100644 --- a/fs/lustre/osc/osc_object.c +++ b/fs/lustre/osc/osc_object.c @@ -493,7 +493,7 @@ int osc_object_invalidate(const struct lu_env *env, struct osc_object *osc) osc_lock_discard_pages(env, osc, 0, CL_PAGE_EOF, true); /* Clear ast data of dlm lock. Do this after discarding all pages */ - osc_object_prune(env, osc2cl(osc)); + cl_object_prune(env, osc2cl(osc)); return 0; } From patchwork Thu Jan 21 17:16:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037185 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 632B4C433E0 for ; Thu, 21 Jan 2021 17:18:18 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with 
ESMTPS id 04D8923A5A for ; Thu, 21 Jan 2021 17:18:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 04D8923A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C6F7221FE79; Thu, 21 Jan 2021 09:17:48 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 1B05221FC17 for ; Thu, 21 Jan 2021 09:17:12 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 5FFCB1008482; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 5E52D1B49C; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:45 -0500 Message-Id: <1611249422-556-23-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 22/39] lustre: uapi: fix compatibility for LL_IOC_MDC_GETINFO X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Qian Yingjin The landed patch "lustre: som: integrate LSOM with lfs find" uses "LL_IOC_MDC_GETINFO_OLD", so while the IOCTL numbers/structs are ABI compatible, they are not API compatible, and applications built against a header that defines LL_IOC_MDC_GETINFO are broken. This patch defines versioned IOCTL numbers: LL_IOC_MDC_GETINFO_V1 and LL_IOC_MDC_GETINFO_V2. The in-tree code can then use the explicitly versioned constants everywhere and declare LL_IOC_MDC_GETINFO in a compatible way, while external applications can explicitly select the version they want. This patch applies the same fix to IOC_MDC_GETFILEINFO. Fixes: 9b5e45e7275e ("lustre: som: integrate LSOM with lfs find") WC-bug-id: https://jira.whamcloud.com/browse/LU-13826 Lustre-commit: 449c648793d2fc ("LU-13826 utils: fix compatibility for LL_IOC_MDC_GETINFO") Signed-off-by: Qian Yingjin Reviewed-on: https://review.whamcloud.com/40858 Reviewed-by: John L.
Hammond Reviewed-by: Andreas Dilger Signed-off-by: James Simmons --- fs/lustre/llite/dir.c | 34 ++++++++++++++++----------------- include/uapi/linux/lustre/lustre_user.h | 14 ++++++-------- 2 files changed, 23 insertions(+), 25 deletions(-) diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index db620ce..c42cff7 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -1634,10 +1634,10 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return ll_obd_statfs(inode, (void __user *)arg); case LL_IOC_LOV_GETSTRIPE: case LL_IOC_LOV_GETSTRIPE_NEW: - case LL_IOC_MDC_GETINFO: - case LL_IOC_MDC_GETINFO_OLD: - case IOC_MDC_GETFILEINFO: - case IOC_MDC_GETFILEINFO_OLD: + case LL_IOC_MDC_GETINFO_V1: + case LL_IOC_MDC_GETINFO_V2: + case IOC_MDC_GETFILEINFO_V1: + case IOC_MDC_GETFILEINFO_V2: case IOC_MDC_GETFILESTRIPE: { struct ptlrpc_request *request = NULL; struct ptlrpc_request *root_request = NULL; @@ -1652,8 +1652,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) struct lu_fid __user *fidp = NULL; int lmmsize; - if (cmd == IOC_MDC_GETFILEINFO_OLD || - cmd == IOC_MDC_GETFILEINFO || + if (cmd == IOC_MDC_GETFILEINFO_V1 || + cmd == IOC_MDC_GETFILEINFO_V2 || cmd == IOC_MDC_GETFILESTRIPE) { filename = ll_getname((const char __user *)arg); if (IS_ERR(filename)) @@ -1675,10 +1675,10 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) goto out_req; } - if (rc == -ENODATA && (cmd == IOC_MDC_GETFILEINFO || - cmd == LL_IOC_MDC_GETINFO || - cmd == IOC_MDC_GETFILEINFO_OLD || - cmd == LL_IOC_MDC_GETINFO_OLD)) { + if (rc == -ENODATA && (cmd == IOC_MDC_GETFILEINFO_V1 || + cmd == LL_IOC_MDC_GETINFO_V1 || + cmd == IOC_MDC_GETFILEINFO_V2 || + cmd == LL_IOC_MDC_GETINFO_V2)) { lmmsize = 0; rc = 0; } @@ -1690,8 +1690,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) cmd == LL_IOC_LOV_GETSTRIPE || cmd == LL_IOC_LOV_GETSTRIPE_NEW) { lump = (struct 
lov_user_md __user *)arg; - } else if (cmd == IOC_MDC_GETFILEINFO_OLD || - cmd == LL_IOC_MDC_GETINFO_OLD){ + } else if (cmd == IOC_MDC_GETFILEINFO_V1 || + cmd == LL_IOC_MDC_GETINFO_V1) { struct lov_user_mds_data_v1 __user *lmdp; lmdp = (struct lov_user_mds_data_v1 __user *)arg; @@ -1724,8 +1724,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) rc = -EOVERFLOW; } - if (cmd == IOC_MDC_GETFILEINFO_OLD || - cmd == LL_IOC_MDC_GETINFO_OLD) { + if (cmd == IOC_MDC_GETFILEINFO_V1 || + cmd == LL_IOC_MDC_GETINFO_V1) { lstat_t st = { 0 }; st.st_dev = inode->i_sb->s_dev; @@ -1748,8 +1748,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) rc = -EFAULT; goto out_req; } - } else if (cmd == IOC_MDC_GETFILEINFO || - cmd == LL_IOC_MDC_GETINFO) { + } else if (cmd == IOC_MDC_GETFILEINFO_V2 || + cmd == LL_IOC_MDC_GETINFO_V2) { struct statx stx = { 0 }; u64 valid = body->mbo_valid; @@ -1783,7 +1783,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) * However, this whould be better decided by the MDS * instead of the client. 
*/ - if (cmd == LL_IOC_MDC_GETINFO && + if (cmd == LL_IOC_MDC_GETINFO_V2 && ll_i2info(inode)->lli_lsm_md) valid &= ~(OBD_MD_FLSIZE | OBD_MD_FLBLOCKS); diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h index 62c6952..835ffce 100644 --- a/include/uapi/linux/lustre/lustre_user.h +++ b/include/uapi/linux/lustre/lustre_user.h @@ -86,8 +86,6 @@ #define fstatat_f fstatat #endif -#define HAVE_LOV_USER_MDS_DATA - #define LUSTRE_EOF 0xffffffffffffffffULL /* for statfs() */ @@ -384,10 +382,12 @@ struct ll_ioc_lease_id { #define IOC_MDC_TYPE 'i' #define IOC_MDC_LOOKUP _IOWR(IOC_MDC_TYPE, 20, struct obd_device *) #define IOC_MDC_GETFILESTRIPE _IOWR(IOC_MDC_TYPE, 21, struct lov_user_md *) -#define IOC_MDC_GETFILEINFO_OLD _IOWR(IOC_MDC_TYPE, 22, struct lov_user_mds_data_v1 *) -#define IOC_MDC_GETFILEINFO _IOWR(IOC_MDC_TYPE, 22, struct lov_user_mds_data) -#define LL_IOC_MDC_GETINFO_OLD _IOWR(IOC_MDC_TYPE, 23, struct lov_user_mds_data_v1 *) -#define LL_IOC_MDC_GETINFO _IOWR(IOC_MDC_TYPE, 23, struct lov_user_mds_data) +#define IOC_MDC_GETFILEINFO_V1 _IOWR(IOC_MDC_TYPE, 22, struct lov_user_mds_data_v1 *) +#define IOC_MDC_GETFILEINFO_V2 _IOWR(IOC_MDC_TYPE, 22, struct lov_user_mds_data) +#define LL_IOC_MDC_GETINFO_V1 _IOWR(IOC_MDC_TYPE, 23, struct lov_user_mds_data_v1 *) +#define LL_IOC_MDC_GETINFO_V2 _IOWR(IOC_MDC_TYPE, 23, struct lov_user_mds_data) +#define IOC_MDC_GETFILEINFO IOC_MDC_GETFILEINFO_V1 +#define LL_IOC_MDC_GETINFO LL_IOC_MDC_GETINFO_V1 #define MAX_OBD_NAME 128 /* If this changes, a NEW ioctl must be added */ @@ -658,7 +658,6 @@ static inline __u32 lov_user_md_size(__u16 stripes, __u32 lmm_magic) * use this. It is unsafe to #define those values in this header as it * is possible the application has already #included . 
*/ -#ifdef HAVE_LOV_USER_MDS_DATA #define lov_user_mds_data lov_user_mds_data_v2 struct lov_user_mds_data_v1 { lstat_t lmd_st; /* MDS stat struct */ @@ -678,7 +677,6 @@ struct lov_user_mds_data_v3 { lstat_t lmd_st; /* MDS stat struct */ struct lov_user_md_v3 lmd_lmm; /* LOV EA V3 user data */ } __attribute__((packed)); -#endif struct lmv_user_mds_data { struct lu_fid lum_fid; From patchwork Thu Jan 21 17:16:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037159 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6587C433DB for ; Thu, 21 Jan 2021 17:17:37 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 90DE423A57 for ; Thu, 21 Jan 2021 17:17:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 90DE423A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0509C21FDCE; Thu, 21 Jan 2021 09:17:28 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 6518D21FC19 for ; Thu, 
21 Jan 2021 09:17:12 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 6270A1008483; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 614ED1B49D; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:46 -0500 Message-Id: <1611249422-556-24-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 23/39] lustre: llite: don't check layout info for page discard X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Bobi Jam The combination CIT_MISC + ignore_layout indicates lock/page manipulation from the OSC layer, which does not care about or access LOV layout related info.
WC-bug-id: https://jira.whamcloud.com/browse/LU-14042 Lustre-commit: 5d1ffc65d5a97c ("LU-14042 llite: don't check layout info for page discard") Signed-off-by: Bobi Jam Reviewed-on: https://review.whamcloud.com/40267 Reviewed-by: Andreas Dilger Reviewed-by: Yingjin Qian Signed-off-by: James Simmons --- fs/lustre/lov/lov_io.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c index 20fcde1..7f0e945 100644 --- a/fs/lustre/lov/lov_io.c +++ b/fs/lustre/lov/lov_io.c @@ -465,8 +465,6 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj, io->ci_result = 0; lio->lis_object = obj; - LASSERT(obj->lo_lsm); - switch (io->ci_type) { case CIT_READ: case CIT_WRITE: @@ -555,6 +553,18 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj, default: LBUG(); } + + /* + * CIT_MISC + ci_ignore_layout can identify the I/O from the OSC layer, + * it won't care/access lov layout related info. 
+ */ + if (io->ci_ignore_layout && io->ci_type == CIT_MISC) { + result = 0; + goto out; + } + + LASSERT(obj->lo_lsm); + result = lov_io_mirror_init(lio, obj, io); if (result) goto out; From patchwork Thu Jan 21 17:16:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037161 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3357C433DB for ; Thu, 21 Jan 2021 17:17:40 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 65D0823A57 for ; Thu, 21 Jan 2021 17:17:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 65D0823A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 573C621FB03; Thu, 21 Jan 2021 09:17:29 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id A0A1421FB09 for ; Thu, 21 Jan 2021 09:17:12 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 6606F1008484; Thu, 21 Jan 2021 12:17:05 -0500 
(EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 649D31B49E; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:47 -0500 Message-Id: <1611249422-556-25-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 24/39] lustre: update version to 2.13.57 X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Oleg Drokin New tag 2.13.57 Signed-off-by: Oleg Drokin Signed-off-by: James Simmons --- include/uapi/linux/lustre/lustre_ver.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/lustre/lustre_ver.h b/include/uapi/linux/lustre/lustre_ver.h index 8d2f2e8..2a6c050 100644 --- a/include/uapi/linux/lustre/lustre_ver.h +++ b/include/uapi/linux/lustre/lustre_ver.h @@ -3,9 +3,9 @@ #define LUSTRE_MAJOR 2 #define LUSTRE_MINOR 13 -#define LUSTRE_PATCH 56 +#define LUSTRE_PATCH 57 #define LUSTRE_FIX 0 -#define LUSTRE_VERSION_STRING "2.13.56" +#define LUSTRE_VERSION_STRING "2.13.57" #define OBD_OCD_VERSION(major, minor, patch, fix) \ (((major) << 24) + ((minor) << 16) + ((patch) << 8) + (fix)) From patchwork Thu Jan 21 17:16:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037209 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, 
HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48BD2C433E0 for ; Thu, 21 Jan 2021 17:19:01 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F324723A57 for ; Thu, 21 Jan 2021 17:19:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F324723A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 497DE21FF9D; Thu, 21 Jan 2021 09:18:08 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id DA56A21FB09 for ; Thu, 21 Jan 2021 09:17:12 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 6ACC11008489; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 680DE1B49B; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:48 -0500 Message-Id: <1611249422-556-26-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 25/39] lnet: o2iblnd: retry qp creation with reduced queue depth 
X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Serguei Smirnov , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Serguei Smirnov If negotiated number of frags * queue depth is too large for successful qp creation, reduce the queue depth in a loop until qp creation succeeds or the queue depth dips below 2. Remember the reduced queue depth value to use for later connections to the same peer. WC-bug-id: https://jira.whamcloud.com/browse/LU-12901 Lustre-commit: 8a3ef5713cc4ae ("LU-12901 o2iblnd: retry qp creation with reduced queue depth") Signed-off-by: Serguei Smirnov Reviewed-on: https://review.whamcloud.com/40748 Reviewed-by: Amir Shehata Reviewed-by: Cyril Bordage Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd.c | 33 ++++++++++++++++++++++++++------- net/lnet/klnds/o2iblnd/o2iblnd.h | 2 ++ 2 files changed, 28 insertions(+), 7 deletions(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c index 9c65524..fc515fc 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd.c @@ -336,6 +336,7 @@ int kiblnd_create_peer(struct lnet_ni *ni, struct kib_peer_ni **peerp, peer_ni->ibp_last_alive = 0; peer_ni->ibp_max_frags = IBLND_MAX_RDMA_FRAGS; peer_ni->ibp_queue_depth = ni->ni_net->net_tunables.lct_peer_tx_credits; + peer_ni->ibp_queue_depth_mod = 0; /* try to use the default */ atomic_set(&peer_ni->ibp_refcount, 1); /* 1 ref for caller */ INIT_LIST_HEAD(&peer_ni->ibp_list); @@ -795,13 +796,28 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, init_qp_attr.qp_type = IB_QPT_RC; init_qp_attr.send_cq = cq; init_qp_attr.recv_cq = cq; - /* kiblnd_send_wrs() can change the connection's queue depth if - * 
the maximum work requests for the device is maxed out - */ - init_qp_attr.cap.max_send_wr = kiblnd_send_wrs(conn); - init_qp_attr.cap.max_recv_wr = IBLND_RECV_WRS(conn); - rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, &init_qp_attr); + if (peer_ni->ibp_queue_depth_mod && + peer_ni->ibp_queue_depth_mod < peer_ni->ibp_queue_depth) { + conn->ibc_queue_depth = peer_ni->ibp_queue_depth_mod; + CDEBUG(D_NET, "Use reduced queue depth %u (from %u)\n", + peer_ni->ibp_queue_depth_mod, + peer_ni->ibp_queue_depth); + } + + do { + /* kiblnd_send_wrs() can change the connection's queue depth if + * the maximum work requests for the device is maxed out + */ + init_qp_attr.cap.max_send_wr = kiblnd_send_wrs(conn); + init_qp_attr.cap.max_recv_wr = IBLND_RECV_WRS(conn); + rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, + &init_qp_attr); + if (rc != -ENOMEM || conn->ibc_queue_depth < 2) + break; + conn->ibc_queue_depth--; + } while (rc); + if (rc) { CERROR("Can't create QP: %d, send_wr: %d, recv_wr: %d, send_sge: %d, recv_sge: %d\n", rc, init_qp_attr.cap.max_send_wr, @@ -813,11 +829,14 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, conn->ibc_sched = sched; - if (conn->ibc_queue_depth != peer_ni->ibp_queue_depth) + if (!peer_ni->ibp_queue_depth_mod && + conn->ibc_queue_depth != peer_ni->ibp_queue_depth) { CWARN("peer %s - queue depth reduced from %u to %u to allow for qp creation\n", libcfs_nid2str(peer_ni->ibp_nid), peer_ni->ibp_queue_depth, conn->ibc_queue_depth); + peer_ni->ibp_queue_depth_mod = conn->ibc_queue_depth; + } conn->ibc_rxs = kzalloc_cpt(IBLND_RX_MSGS(conn) * sizeof(*conn->ibc_rxs), diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.h b/net/lnet/klnds/o2iblnd/o2iblnd.h index 1fc68e1..424ca07 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.h +++ b/net/lnet/klnds/o2iblnd/o2iblnd.h @@ -638,6 +638,8 @@ struct kib_peer_ni { u16 ibp_max_frags; /* max_peer_credits */ u16 ibp_queue_depth; + /* reduced value which allows conn to be created if max fails */ + u16 
ibp_queue_depth_mod; }; extern struct kib_data kiblnd_data; From patchwork Thu Jan 21 17:16:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037163 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1FE9C433E0 for ; Thu, 21 Jan 2021 17:17:43 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6331F23A57 for ; Thu, 21 Jan 2021 17:17:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6331F23A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4C1E921FDDE; Thu, 21 Jan 2021 09:17:31 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 3416321FC28 for ; Thu, 21 Jan 2021 09:17:13 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 6C722100848A; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 6B15F1B49C; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: 
James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:49 -0500 Message-Id: <1611249422-556-27-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 26/39] lustre: lov: fix SEEK_HOLE calcs at component end X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mikhail Pershin , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mikhail Pershin If data ends exactly at a component end, LOV assumed that this is not yet a hole in the file and that the next component would take care of it. However, there may be no next component initialized yet if the file ends exactly at a component boundary, so an error is returned instead of a hole offset. This patch fixes that issue: if a component reports a hole offset at its component end, the offset is saved and used as the result when no other component reports a valid hole offset. WC-bug-id: https://jira.whamcloud.com/browse/LU-14143 Lustre-commit: dbb6b493ad9f98 ("LU-14143 lov: fix SEEK_HOLE calcs at component end") Signed-off-by: Mikhail Pershin Reviewed-on: https://review.whamcloud.com/40713 Reviewed-by: Andreas Dilger Reviewed-by: John L.
Hammond Signed-off-by: James Simmons --- fs/lustre/lov/lov_io.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c index 7f0e945..ac88a55 100644 --- a/fs/lustre/lov/lov_io.c +++ b/fs/lustre/lov/lov_io.c @@ -1295,6 +1295,7 @@ static void lov_io_lseek_end(const struct lu_env *env, struct lov_stripe_md *lsm = lio->lis_object->lo_lsm; struct lov_io_sub *sub; loff_t offset = -ENXIO; + u64 hole_off = 0; bool seek_hole = io->u.ci_lseek.ls_whence == SEEK_HOLE; list_for_each_entry(sub, &lio->lis_active, sub_linkage) { @@ -1302,6 +1303,7 @@ static void lov_io_lseek_end(const struct lu_env *env, int index = lov_comp_entry(sub->sub_subio_index); int stripe = lov_comp_stripe(sub->sub_subio_index); loff_t sub_off, lov_off; + u64 comp_end = lsm->lsm_entries[index]->lsme_extent.e_end; lov_io_end_wrapper(sub->sub_env, subio); @@ -1347,10 +1349,22 @@ static void lov_io_lseek_end(const struct lu_env *env, /* resulting offset can be out of component range if stripe * object is full and its file size was returned as virtual * hole start. Skip this result, the next component will give - * us correct lseek result. + * us correct lseek result but keep possible hole offset in + * case there is no more components ahead */ - if (lov_off >= lsm->lsm_entries[index]->lsme_extent.e_end) + if (lov_off >= comp_end) { + /* must be SEEK_HOLE case */ + if (likely(seek_hole)) { + /* save comp end as potential hole offset */ + hole_off = max_t(u64, comp_end, hole_off); + } else { + io->ci_result = -EINVAL; + CDEBUG(D_INFO, + "off %lld >= comp_end %llu: rc = %d\n", + lov_off, comp_end, io->ci_result); + } continue; + } CDEBUG(D_INFO, "SEEK_%s: %lld->%lld/%lld: rc = %d\n", seek_hole ? 
"HOLE" : "DATA", @@ -1358,6 +1372,10 @@ static void lov_io_lseek_end(const struct lu_env *env, sub->sub_io.ci_result); offset = min_t(u64, offset, lov_off); } + /* no result but some component returns hole as component end */ + if (seek_hole && offset == -ENXIO && hole_off > 0) + offset = hole_off; + io->u.ci_lseek.ls_result = offset; } From patchwork Thu Jan 21 17:16:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037179 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10493C433DB for ; Thu, 21 Jan 2021 17:18:08 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 95A0523A5A for ; Thu, 21 Jan 2021 17:18:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 95A0523A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id EDBCF21FB9F; Thu, 21 Jan 2021 09:17:43 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 7D0F921FC4B for ; Thu, 21 Jan 2021 09:17:13 -0800 (PST) 
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 6F636100848B; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 6E0E71B49D; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:50 -0500 Message-Id: <1611249422-556-28-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 27/39] lustre: lov: instantiate components layout for fallocate X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong fallocate() needs to send an intent lock to the MDS to instantiate layouts such as PFL.
WC-bug-id: https://jira.whamcloud.com/browse/LU-14186 Lustre-commit: 7e25e6c7d0a710 ("LU-14186 lov: instantiate components layout for fallocate") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/40885 Reviewed-by: Andreas Dilger Reviewed-by: Yingjin Qian Reviewed-by: Arshad Hussain Signed-off-by: James Simmons --- fs/lustre/llite/vvp_io.c | 2 +- fs/lustre/lov/lov_io.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c index 8dbe835..b0b31c37 100644 --- a/fs/lustre/llite/vvp_io.c +++ b/fs/lustre/llite/vvp_io.c @@ -361,7 +361,7 @@ static void vvp_io_fini(const struct lu_env *env, const struct cl_io_slice *ios) io->ci_need_write_intent = 0; - LASSERT(io->ci_type == CIT_WRITE || + LASSERT(io->ci_type == CIT_WRITE || cl_io_is_fallocate(io) || cl_io_is_trunc(io) || cl_io_is_mkwrite(io)); CDEBUG(D_VFSTRACE, DFID" write layout, type %u " DEXT "\n", diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c index ac88a55..d4a0c9d 100644 --- a/fs/lustre/lov/lov_io.c +++ b/fs/lustre/lov/lov_io.c @@ -571,6 +571,7 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj, /* check if it needs to instantiate layout */ if (!(io->ci_type == CIT_WRITE || cl_io_is_mkwrite(io) || + cl_io_is_fallocate(io) || (cl_io_is_trunc(io) && io->u.ci_setattr.sa_attr.lvb_size > 0))) { result = 0; goto out; From patchwork Thu Jan 21 17:16:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037195 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from 
mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50E06C433E0 for ; Thu, 21 Jan 2021 17:18:37 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E63C823A5B for ; Thu, 21 Jan 2021 17:18:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E63C823A5B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4D52221FCB2; Thu, 21 Jan 2021 09:17:57 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B5BA921FC28 for ; Thu, 21 Jan 2021 09:17:13 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 72275100848C; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 714F11B49E; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:51 -0500 Message-Id: <1611249422-556-29-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 28/39] lustre: dom: non-blocking enqueue for DOM locks X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mikhail Pershin , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mikhail Pershin DOM lock enqueue waits for blocking locks on the MDT because of the ATOMIC flag, so the MDT thread is blocked until the lock is granted. When many clients attempt to write to a shared file, this may cause server thread starvation and lock contention. Switch to non-atomic lock enqueue for DOM locks. - switch the IO lock to non-intent enqueue, so it doesn't tie up a server thread for a long time while blocked - on the client, take the LVB from l_lvb_data updated by the completion AST and update l_ost_lvb used by DoM - make glimpse behave similarly on MDT and OST: it uses one format with no intent buffer and returns data in the LVB buffer - introduce a new connect flag 'dom_lvb' for compatibility reasons - on the server, handle glimpse for both old and new clients by filling either the LVB reply buffer or the mdt_body buffer - don't take an RPC slot for a DOM enqueue, as is already the case for EXTENT locks; update ldlm_cli_enqueue_fini() to accept ldlm_enqueue_info as a parameter - check that no atomic local lock is issued with a mandatory DOM bit; trybits should be used instead WC-bug-id: https://jira.whamcloud.com/browse/LU-10664 Lustre-commit: 3c75d2522786a2a ("LU-10664 dom: non-blocking enqueue for DOM locks") Signed-off-by: Mikhail Pershin Reviewed-on: https://review.whamcloud.com/36903 Reviewed-by: Vitaly Fertman Reviewed-by: Andreas Dilger Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_dlm.h | 3 +- fs/lustre/include/lustre_export.h | 5 ++ fs/lustre/ldlm/ldlm_request.c | 66 +++++++-------- fs/lustre/llite/llite_lib.c | 3 +- fs/lustre/mdc/mdc_dev.c | 144 +++++++++++++++++++-------------- fs/lustre/mdc/mdc_internal.h | 10 +++ fs/lustre/mdc/mdc_locks.c | 9 ++- fs/lustre/obdclass/lprocfs_status.c | 1 + fs/lustre/osc/osc_request.c | 39 ++------- fs/lustre/ptlrpc/wiretest.c |
2 + include/uapi/linux/lustre/lustre_idl.h | 1 + 11 files changed, 147 insertions(+), 136 deletions(-) diff --git a/fs/lustre/include/lustre_dlm.h b/fs/lustre/include/lustre_dlm.h index e4c95a2..8156f75 100644 --- a/fs/lustre/include/lustre_dlm.h +++ b/fs/lustre/include/lustre_dlm.h @@ -1341,8 +1341,7 @@ int ldlm_prep_elc_req(struct obd_export *exp, struct ptlrpc_request *ldlm_enqueue_pack(struct obd_export *exp, int lvb_len); int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, - enum ldlm_type type, u8 with_policy, - enum ldlm_mode mode, + struct ldlm_enqueue_info *einfo, u8 with_policy, u64 *flags, void *lvb, u32 lvb_len, const struct lustre_handle *lockh, int rc); int ldlm_cli_convert_req(struct ldlm_lock *lock, u32 *flags, u64 new_bits); diff --git a/fs/lustre/include/lustre_export.h b/fs/lustre/include/lustre_export.h index ed49a97..4cc88ef 100644 --- a/fs/lustre/include/lustre_export.h +++ b/fs/lustre/include/lustre_export.h @@ -290,6 +290,11 @@ static inline int exp_connect_lseek(struct obd_export *exp) return !!(exp_connect_flags2(exp) & OBD_CONNECT2_LSEEK); } +static inline int exp_connect_dom_lvb(struct obd_export *exp) +{ + return !!(exp_connect_flags2(exp) & OBD_CONNECT2_DOM_LVB); +} + enum { /* archive_ids in array format */ KKUC_CT_DATA_ARRAY_MAGIC = 0x092013cea, diff --git a/fs/lustre/ldlm/ldlm_request.c b/fs/lustre/ldlm/ldlm_request.c index 86b10a7..1c2ecf2 100644 --- a/fs/lustre/ldlm/ldlm_request.c +++ b/fs/lustre/ldlm/ldlm_request.c @@ -355,9 +355,12 @@ static void failed_lock_cleanup(struct ldlm_namespace *ns, } } -static bool ldlm_request_slot_needed(enum ldlm_type type) +static bool ldlm_request_slot_needed(struct ldlm_enqueue_info *einfo) { - return type == LDLM_FLOCK || type == LDLM_IBITS; + /* exclude EXTENT locks and DOM-only IBITS locks because they + * are asynchronous and don't wait on server being blocked. 
+ */ + return einfo->ei_type == LDLM_FLOCK || einfo->ei_type == LDLM_IBITS; } /** @@ -366,19 +369,19 @@ static bool ldlm_request_slot_needed(enum ldlm_type type) * Called after receiving reply from server. */ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, - enum ldlm_type type, u8 with_policy, - enum ldlm_mode mode, - u64 *flags, void *lvb, u32 lvb_len, - const struct lustre_handle *lockh, int rc) + struct ldlm_enqueue_info *einfo, + u8 with_policy, u64 *ldlm_flags, void *lvb, + u32 lvb_len, const struct lustre_handle *lockh, + int rc) { struct ldlm_namespace *ns = exp->exp_obd->obd_namespace; const struct lu_env *env = NULL; - int is_replay = *flags & LDLM_FL_REPLAY; + int is_replay = *ldlm_flags & LDLM_FL_REPLAY; struct ldlm_lock *lock; struct ldlm_reply *reply; int cleanup_phase = 1; - if (ldlm_request_slot_needed(type)) + if (ldlm_request_slot_needed(einfo)) obd_put_request_slot(&req->rq_import->imp_obd->u.cli); ptlrpc_put_mod_rpc_slot(req); @@ -386,7 +389,7 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, lock = ldlm_handle2lock(lockh); /* ldlm_cli_enqueue is holding a reference on this lock. */ if (!lock) { - LASSERT(type == LDLM_FLOCK); + LASSERT(einfo->ei_type == LDLM_FLOCK); return -ENOLCK; } @@ -443,20 +446,20 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, lock_res_and_lock(lock); lock->l_remote_handle = reply->lock_handle; - *flags = ldlm_flags_from_wire(reply->lock_flags); + *ldlm_flags = ldlm_flags_from_wire(reply->lock_flags); lock->l_flags |= ldlm_flags_from_wire(reply->lock_flags & LDLM_FL_INHERIT_MASK); unlock_res_and_lock(lock); CDEBUG(D_INFO, "local: %p, remote cookie: %#llx, flags: 0x%llx\n", - lock, reply->lock_handle.cookie, *flags); + lock, reply->lock_handle.cookie, *ldlm_flags); /* * If enqueue returned a blocked lock but the completion handler has * already run, then it fixed up the resource and we don't need to do it * again. 
*/ - if ((*flags) & LDLM_FL_LOCK_CHANGED) { + if ((*ldlm_flags) & LDLM_FL_LOCK_CHANGED) { int newmode = reply->lock_desc.l_req_mode; LASSERT(!is_replay); @@ -490,12 +493,12 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, &lock->l_policy_data); } - if (type != LDLM_PLAIN) + if (einfo->ei_type != LDLM_PLAIN) LDLM_DEBUG(lock, "client-side enqueue, new policy data"); } - if ((*flags) & LDLM_FL_AST_SENT) { + if ((*ldlm_flags) & LDLM_FL_AST_SENT) { lock_res_and_lock(lock); ldlm_bl_desc2lock(&reply->lock_desc, lock); lock->l_flags |= LDLM_FL_CBPENDING | LDLM_FL_BL_AST; @@ -526,9 +529,10 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, } if (!is_replay) { - rc = ldlm_lock_enqueue(env, ns, &lock, NULL, flags); + rc = ldlm_lock_enqueue(env, ns, &lock, NULL, ldlm_flags); if (lock->l_completion_ast) { - int err = lock->l_completion_ast(lock, *flags, NULL); + int err = lock->l_completion_ast(lock, *ldlm_flags, + NULL); if (!rc) rc = err; @@ -548,7 +552,7 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, LDLM_DEBUG(lock, "client-side enqueue END"); cleanup: if (cleanup_phase == 1 && rc) - failed_lock_cleanup(ns, lock, mode); + failed_lock_cleanup(ns, lock, einfo->ei_mode); /* Put lock 2 times, the second reference is held by ldlm_cli_enqueue */ LDLM_LOCK_PUT(lock); LDLM_LOCK_RELEASE(lock); @@ -811,24 +815,15 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp, /* extended LDLM opcodes in client stats */ if (exp->exp_obd->obd_svc_stats != NULL) { - bool glimpse = *flags & LDLM_FL_HAS_INTENT; - - /* OST glimpse has no intent buffer */ - if (req_capsule_has_field(&req->rq_pill, &RMF_LDLM_INTENT, - RCL_CLIENT)) { - struct ldlm_intent *it; - - it = req_capsule_client_get(&req->rq_pill, - &RMF_LDLM_INTENT); - glimpse = (it && (it->opc == IT_GLIMPSE)); - } - - if (!glimpse) - ldlm_svc_get_eopc(body, exp->exp_obd->obd_svc_stats); - else + /* glimpse is intent with no 
intent buffer */ + if (*flags & LDLM_FL_HAS_INTENT && + !req_capsule_has_field(&req->rq_pill, &RMF_LDLM_INTENT, + RCL_CLIENT)) lprocfs_counter_incr(exp->exp_obd->obd_svc_stats, PTLRPC_LAST_CNTR + LDLM_GLIMPSE_ENQUEUE); + else + ldlm_svc_get_eopc(body, exp->exp_obd->obd_svc_stats); } /* It is important to obtain modify RPC slot first (if applicable), so @@ -838,7 +833,7 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp, if (einfo->ei_enq_slot) ptlrpc_get_mod_rpc_slot(req); - if (ldlm_request_slot_needed(einfo->ei_type)) { + if (ldlm_request_slot_needed(einfo)) { rc = obd_get_request_slot(&req->rq_import->imp_obd->u.cli); if (rc) { if (einfo->ei_enq_slot) @@ -858,9 +853,8 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp, rc = ptlrpc_queue_wait(req); - err = ldlm_cli_enqueue_fini(exp, req, einfo->ei_type, policy ? 1 : 0, - einfo->ei_mode, flags, lvb, lvb_len, - lockh, rc); + err = ldlm_cli_enqueue_fini(exp, req, einfo, policy ? 1 : 0, flags, + lvb, lvb_len, lockh, rc); /* * If ldlm_cli_enqueue_fini did not find the lock, we need to free diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 570d51a..3139669 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -265,7 +265,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt) OBD_CONNECT2_ASYNC_DISCARD | OBD_CONNECT2_PCC | OBD_CONNECT2_CRUSH | OBD_CONNECT2_LSEEK | - OBD_CONNECT2_GETATTR_PFID; + OBD_CONNECT2_GETATTR_PFID | + OBD_CONNECT2_DOM_LVB; if (sbi->ll_flags & LL_SBI_LRU_RESIZE) data->ocd_connect_flags |= OBD_CONNECT_LRU_RESIZE; diff --git a/fs/lustre/mdc/mdc_dev.c b/fs/lustre/mdc/mdc_dev.c index 214fd31..e86e69d 100644 --- a/fs/lustre/mdc/mdc_dev.c +++ b/fs/lustre/mdc/mdc_dev.c @@ -294,20 +294,16 @@ void mdc_lock_lockless_cancel(const struct lu_env *env, * Helper for osc_dlm_blocking_ast() handling discrepancies between cl_lock * and ldlm_lock caches. 
*/ -static int mdc_dlm_blocking_ast0(const struct lu_env *env, - struct ldlm_lock *dlmlock, - int flag) +static int mdc_dlm_canceling(const struct lu_env *env, + struct ldlm_lock *dlmlock) { struct cl_object *obj = NULL; int result = 0; bool discard; enum cl_lock_mode mode = CLM_READ; - LASSERT(flag == LDLM_CB_CANCELING); - LASSERT(dlmlock); - lock_res_and_lock(dlmlock); - if (dlmlock->l_granted_mode != dlmlock->l_req_mode) { + if (!ldlm_is_granted(dlmlock)) { dlmlock->l_ast_data = NULL; unlock_res_and_lock(dlmlock); return 0; @@ -349,11 +345,11 @@ static int mdc_dlm_blocking_ast0(const struct lu_env *env, } int mdc_ldlm_blocking_ast(struct ldlm_lock *dlmlock, - struct ldlm_lock_desc *new, void *data, int flag) + struct ldlm_lock_desc *new, void *data, int reason) { int rc = 0; - switch (flag) { + switch (reason) { case LDLM_CB_BLOCKING: { struct lustre_handle lockh; @@ -384,7 +380,7 @@ int mdc_ldlm_blocking_ast(struct ldlm_lock *dlmlock, break; } - rc = mdc_dlm_blocking_ast0(env, dlmlock, flag); + rc = mdc_dlm_canceling(env, dlmlock); cl_env_put(env, &refcheck); break; } @@ -430,6 +426,7 @@ void mdc_lock_lvb_update(const struct lu_env *env, struct osc_object *osc, attr->cat_kms = size; setkms = 1; } + ldlm_lock_allow_match_locked(dlmlock); } /* The size should not be less than the kms */ @@ -479,7 +476,7 @@ static void mdc_lock_granted(const struct lu_env *env, struct osc_lock *oscl, /* Lock must have been granted. 
*/ lock_res_and_lock(dlmlock); - if (dlmlock->l_granted_mode == dlmlock->l_req_mode) { + if (ldlm_is_granted(dlmlock)) { struct cl_lock_descr *descr = &oscl->ols_cl.cls_lock->cll_descr; /* extend the lock extent, otherwise it will have problem when @@ -505,7 +502,7 @@ static void mdc_lock_granted(const struct lu_env *env, struct osc_lock *oscl, /** * Lock upcall function that is executed either when a reply to ENQUEUE rpc is - * received from a server, or after osc_enqueue_base() matched a local DLM + * received from a server, or after mdc_enqueue_base() matched a local DLM * lock. */ static int mdc_lock_upcall(void *cookie, struct lustre_handle *lockh, @@ -561,51 +558,64 @@ static int mdc_lock_upcall(void *cookie, struct lustre_handle *lockh, return rc; } +/* This is needed only for old servers (before 2.14) support */ int mdc_fill_lvb(struct ptlrpc_request *req, struct ost_lvb *lvb) { struct mdt_body *body; + /* get LVB data from mdt_body otherwise */ body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY); if (!body) return -EPROTO; - lvb->lvb_mtime = body->mbo_mtime; - lvb->lvb_atime = body->mbo_atime; - lvb->lvb_ctime = body->mbo_ctime; - lvb->lvb_blocks = body->mbo_dom_blocks; - lvb->lvb_size = body->mbo_dom_size; + if (!(body->mbo_valid & OBD_MD_DOM_SIZE)) + return -EPROTO; + mdc_body2lvb(body, lvb); return 0; } -int mdc_enqueue_fini(struct ptlrpc_request *req, osc_enqueue_upcall_f upcall, - void *cookie, struct lustre_handle *lockh, - enum ldlm_mode mode, u64 *flags, int errcode) +int mdc_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, + osc_enqueue_upcall_f upcall, void *cookie, + struct lustre_handle *lockh, enum ldlm_mode mode, + u64 *flags, int errcode) { struct osc_lock *ols = cookie; - struct ldlm_lock *lock; + bool glimpse = *flags & LDLM_FL_HAS_INTENT; int rc = 0; - /* The request was created before ldlm_cli_enqueue call. 
*/ - if (errcode == ELDLM_LOCK_ABORTED) { + /* needed only for glimpse from an old server (< 2.14) */ + if (glimpse && !exp_connect_dom_lvb(exp)) + rc = mdc_fill_lvb(req, &ols->ols_lvb); + + if (glimpse && errcode == ELDLM_LOCK_ABORTED) { struct ldlm_reply *rep; rep = req_capsule_server_get(&req->rq_pill, &RMF_DLM_REP); - LASSERT(rep); - - rep->lock_policy_res2 = - ptlrpc_status_ntoh(rep->lock_policy_res2); - if (rep->lock_policy_res2) - errcode = rep->lock_policy_res2; - - rc = mdc_fill_lvb(req, &ols->ols_lvb); + if (likely(rep)) { + rep->lock_policy_res2 = + ptlrpc_status_ntoh(rep->lock_policy_res2); + if (rep->lock_policy_res2) + errcode = rep->lock_policy_res2; + } else { + rc = -EPROTO; + } *flags |= LDLM_FL_LVB_READY; } else if (errcode == ELDLM_OK) { + struct ldlm_lock *lock; + /* Callers have references, should be valid always */ lock = ldlm_handle2lock(lockh); - LASSERT(lock); - rc = mdc_fill_lvb(req, &lock->l_ost_lvb); + /* At this point ols_lvb must be filled with correct LVB either + * by mdc_fill_lvb() above or by ldlm_cli_enqueue_fini(). + * DoM uses l_ost_lvb to store LVB data, so copy it here from + * just updated ols_lvb. + */ + lock_res_and_lock(lock); + memcpy(&lock->l_ost_lvb, &ols->ols_lvb, + sizeof(lock->l_ost_lvb)); + unlock_res_and_lock(lock); LDLM_LOCK_PUT(lock); *flags |= LDLM_FL_LVB_READY; } @@ -629,6 +639,10 @@ int mdc_enqueue_interpret(const struct lu_env *env, struct ptlrpc_request *req, struct ldlm_lock *lock; struct lustre_handle *lockh = &aa->oa_lockh; enum ldlm_mode mode = aa->oa_mode; + struct ldlm_enqueue_info einfo = { + .ei_type = aa->oa_type, + .ei_mode = mode, + }; LASSERT(!aa->oa_speculative); @@ -643,7 +657,7 @@ int mdc_enqueue_interpret(const struct lu_env *env, struct ptlrpc_request *req, /* Take an additional reference so that a blocking AST that * ldlm_cli_enqueue_fini() might post for a failed lock, is guaranteed * to arrive after an upcall has been executed by - * osc_enqueue_fini(). + * mdc_enqueue_fini(). 
*/ ldlm_lock_addref(lockh, mode); @@ -654,12 +668,12 @@ int mdc_enqueue_interpret(const struct lu_env *env, struct ptlrpc_request *req, OBD_FAIL_TIMEOUT(OBD_FAIL_OSC_CP_ENQ_RACE, 1); /* Complete obtaining the lock procedure. */ - rc = ldlm_cli_enqueue_fini(aa->oa_exp, req, aa->oa_type, 1, - aa->oa_mode, aa->oa_flags, NULL, 0, - lockh, rc); + rc = ldlm_cli_enqueue_fini(aa->oa_exp, req, &einfo, 1, aa->oa_flags, + aa->oa_lvb, aa->oa_lvb ? + sizeof(*aa->oa_lvb) : 0, lockh, rc); /* Complete mdc stuff. */ - rc = mdc_enqueue_fini(req, aa->oa_upcall, aa->oa_cookie, lockh, mode, - aa->oa_flags, rc); + rc = mdc_enqueue_fini(aa->oa_exp, req, aa->oa_upcall, aa->oa_cookie, + lockh, mode, aa->oa_flags, rc); OBD_FAIL_TIMEOUT(OBD_FAIL_OSC_CP_CANCEL_RACE, 10); @@ -678,8 +692,7 @@ int mdc_enqueue_interpret(const struct lu_env *env, struct ptlrpc_request *req, */ int mdc_enqueue_send(const struct lu_env *env, struct obd_export *exp, struct ldlm_res_id *res_id, u64 *flags, - union ldlm_policy_data *policy, - struct ost_lvb *lvb, int kms_valid, + union ldlm_policy_data *policy, struct ost_lvb *lvb, osc_enqueue_upcall_f upcall, void *cookie, struct ldlm_enqueue_info *einfo, int async) { @@ -692,17 +705,16 @@ int mdc_enqueue_send(const struct lu_env *env, struct obd_export *exp, u64 match_flags = *flags; LIST_HEAD(cancels); int rc, count; + int lvb_size; + bool compat_glimpse = glimpse && !exp_connect_dom_lvb(exp); mode = einfo->ei_mode; if (einfo->ei_mode == LCK_PR) mode |= LCK_PW; + match_flags |= LDLM_FL_LVB_READY; if (glimpse) match_flags |= LDLM_FL_BLOCK_GRANTED; - /* DOM locking uses LDLM_FL_KMS_IGNORE to mark locks wich have no valid - * LVB information, e.g. canceled locks or locks of just pruned object, - * such locks should be skipped. 
- */ mode = ldlm_lock_match(obd->obd_namespace, match_flags, res_id, einfo->ei_type, policy, mode, &lockh); if (mode) { @@ -733,7 +745,9 @@ int mdc_enqueue_send(const struct lu_env *env, struct obd_export *exp, if (*flags & (LDLM_FL_TEST_LOCK | LDLM_FL_MATCH_LOCK)) return -ENOLCK; - req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_LDLM_INTENT); + /* Glimpse is intent on old server */ + req = ptlrpc_request_alloc(class_exp2cliimp(exp), compat_glimpse ? + &RQF_LDLM_INTENT : &RQF_LDLM_ENQUEUE); if (!req) return -ENOMEM; @@ -751,20 +765,27 @@ int mdc_enqueue_send(const struct lu_env *env, struct obd_export *exp, return rc; } - /* pack the intent */ - lit = req_capsule_client_get(&req->rq_pill, &RMF_LDLM_INTENT); - lit->opc = glimpse ? IT_GLIMPSE : IT_BRW; - - req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER, 0); - req_capsule_set_size(&req->rq_pill, &RMF_ACL, RCL_SERVER, 0); - ptlrpc_request_set_replen(req); + if (compat_glimpse) { + /* pack the glimpse intent */ + lit = req_capsule_client_get(&req->rq_pill, &RMF_LDLM_INTENT); + lit->opc = IT_GLIMPSE; + } /* users of mdc_enqueue() can pass this flag for ldlm_lock_match() */ *flags &= ~LDLM_FL_BLOCK_GRANTED; - /* All MDC IO locks are intents */ - *flags |= LDLM_FL_HAS_INTENT; - rc = ldlm_cli_enqueue(exp, &req, einfo, res_id, policy, flags, NULL, - 0, LVB_T_NONE, &lockh, async); + if (compat_glimpse) { + req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER, 0); + req_capsule_set_size(&req->rq_pill, &RMF_ACL, RCL_SERVER, 0); + lvb_size = 0; + } else { + lvb_size = sizeof(*lvb); + req_capsule_set_size(&req->rq_pill, &RMF_DLM_LVB, RCL_SERVER, + lvb_size); + } + ptlrpc_request_set_replen(req); + + rc = ldlm_cli_enqueue(exp, &req, einfo, res_id, policy, flags, lvb, + lvb_size, LVB_T_OST, &lockh, async); if (async) { if (!rc) { struct osc_enqueue_args *aa; @@ -778,7 +799,7 @@ int mdc_enqueue_send(const struct lu_env *env, struct obd_export *exp, aa->oa_cookie = cookie; aa->oa_speculative = false; 
aa->oa_flags = flags; - aa->oa_lvb = lvb; + aa->oa_lvb = compat_glimpse ? NULL : lvb; req->rq_interpret_reply = mdc_enqueue_interpret; ptlrpcd_add_req(req); @@ -788,7 +809,7 @@ int mdc_enqueue_send(const struct lu_env *env, struct obd_export *exp, return rc; } - rc = mdc_enqueue_fini(req, upcall, cookie, &lockh, einfo->ei_mode, + rc = mdc_enqueue_fini(exp, req, upcall, cookie, &lockh, einfo->ei_mode, flags, rc); ptlrpc_req_finished(req); return rc; @@ -874,8 +895,7 @@ static int mdc_lock_enqueue(const struct lu_env *env, mdc_lock_build_policy(env, lock, policy); LASSERT(!oscl->ols_speculative); result = mdc_enqueue_send(env, osc_export(osc), resname, - &oscl->ols_flags, policy, - &oscl->ols_lvb, osc->oo_oinfo->loi_kms_valid, + &oscl->ols_flags, policy, &oscl->ols_lvb, upcall, cookie, &oscl->ols_einfo, async); if (result == 0) { if (osc_lock_is_lockless(oscl)) { @@ -1429,7 +1449,7 @@ static int mdc_object_flush(const struct lu_env *env, struct cl_object *obj, * so init it here with given osc_object. 
*/ mdc_set_dom_lock_data(lock, cl2osc(obj)); - return mdc_dlm_blocking_ast0(env, lock, LDLM_CB_CANCELING); + return mdc_dlm_canceling(env, lock); } static const struct cl_object_operations mdc_ops = { diff --git a/fs/lustre/mdc/mdc_internal.h b/fs/lustre/mdc/mdc_internal.h index 065cba5..91e8240 100644 --- a/fs/lustre/mdc/mdc_internal.h +++ b/fs/lustre/mdc/mdc_internal.h @@ -168,6 +168,16 @@ int mdc_unpack_acl(struct ptlrpc_request *req, struct lustre_md *md) } #endif +static inline void mdc_body2lvb(struct mdt_body *body, struct ost_lvb *lvb) +{ + LASSERT(body->mbo_valid & OBD_MD_DOM_SIZE); + lvb->lvb_mtime = body->mbo_mtime; + lvb->lvb_atime = body->mbo_atime; + lvb->lvb_ctime = body->mbo_ctime; + lvb->lvb_blocks = body->mbo_dom_blocks; + lvb->lvb_size = body->mbo_dom_size; +} + static inline unsigned long hash_x_index(u64 hash, int hash64) { if (BITS_PER_LONG == 32 && hash64) diff --git a/fs/lustre/mdc/mdc_locks.c b/fs/lustre/mdc/mdc_locks.c index 8bbb9e1..dbf402a 100644 --- a/fs/lustre/mdc/mdc_locks.c +++ b/fs/lustre/mdc/mdc_locks.c @@ -872,7 +872,10 @@ static int mdc_finish_enqueue(struct obd_export *exp, LDLM_DEBUG(lock, "DoM lock is returned by: %s, size: %llu", ldlm_it2str(it->it_op), body->mbo_dom_size); - rc = mdc_fill_lvb(req, &lock->l_ost_lvb); + lock_res_and_lock(lock); + mdc_body2lvb(body, &lock->l_ost_lvb); + ldlm_lock_allow_match_locked(lock); + unlock_res_and_lock(lock); } out_lock: LDLM_LOCK_PUT(lock); @@ -1368,8 +1371,8 @@ static int mdc_intent_getattr_async_interpret(const struct lu_env *env, if (OBD_FAIL_CHECK(OBD_FAIL_MDC_GETATTR_ENQUEUE)) rc = -ETIMEDOUT; - rc = ldlm_cli_enqueue_fini(exp, req, einfo->ei_type, 1, einfo->ei_mode, - &flags, NULL, 0, lockh, rc); + rc = ldlm_cli_enqueue_fini(exp, req, einfo, 1, &flags, NULL, 0, + lockh, rc); if (rc < 0) { CERROR("%s: ldlm_cli_enqueue_fini() failed: rc = %d\n", exp->exp_obd->obd_name, rc); diff --git a/fs/lustre/obdclass/lprocfs_status.c b/fs/lustre/obdclass/lprocfs_status.c index 6ce0a5d..0ed1bd5 
100644 --- a/fs/lustre/obdclass/lprocfs_status.c +++ b/fs/lustre/obdclass/lprocfs_status.c @@ -130,6 +130,7 @@ "fidmap", /* 0x10000 */ "getattr_pfid", /* 0x20000 */ "lseek", /* 0x40000 */ + "dom_lvb", /* 0x80000 */ NULL }; diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c index 4a4b5ef..a6a8cac 100644 --- a/fs/lustre/osc/osc_request.c +++ b/fs/lustre/osc/osc_request.c @@ -2684,6 +2684,10 @@ int osc_enqueue_interpret(const struct lu_env *env, struct ptlrpc_request *req, struct ost_lvb *lvb = aa->oa_lvb; u32 lvb_len = sizeof(*lvb); u64 flags = 0; + struct ldlm_enqueue_info einfo = { + .ei_type = aa->oa_type, + .ei_mode = mode, + }; /* ldlm_cli_enqueue is holding a reference on the lock, so it must * be valid. @@ -2712,9 +2716,8 @@ int osc_enqueue_interpret(const struct lu_env *env, struct ptlrpc_request *req, } /* Complete obtaining the lock procedure. */ - rc = ldlm_cli_enqueue_fini(aa->oa_exp, req, aa->oa_type, 1, - aa->oa_mode, aa->oa_flags, lvb, lvb_len, - lockh, rc); + rc = ldlm_cli_enqueue_fini(aa->oa_exp, req, &einfo, 1, aa->oa_flags, + lvb, lvb_len, lockh, rc); /* Complete osc stuff. 
*/ rc = osc_enqueue_fini(req, aa->oa_upcall, aa->oa_cookie, lockh, mode, aa->oa_flags, aa->oa_speculative, rc); @@ -2821,22 +2824,6 @@ int osc_enqueue_base(struct obd_export *exp, struct ldlm_res_id *res_id, if (*flags & (LDLM_FL_TEST_LOCK | LDLM_FL_MATCH_LOCK)) return -ENOLCK; - if (intent) { - req = ptlrpc_request_alloc(class_exp2cliimp(exp), - &RQF_LDLM_ENQUEUE_LVB); - if (!req) - return -ENOMEM; - - rc = ldlm_prep_enqueue_req(exp, req, NULL, 0); - if (rc) { - ptlrpc_request_free(req); - return rc; - } - - req_capsule_set_size(&req->rq_pill, &RMF_DLM_LVB, RCL_SERVER, - sizeof(*lvb)); - ptlrpc_request_set_replen(req); - } /* users of osc_enqueue() can pass this flag for ldlm_lock_match() */ *flags &= ~LDLM_FL_BLOCK_GRANTED; @@ -2869,16 +2856,12 @@ int osc_enqueue_base(struct obd_export *exp, struct ldlm_res_id *res_id, req->rq_interpret_reply = osc_enqueue_interpret; ptlrpc_set_add_req(rqset, req); - } else if (intent) { - ptlrpc_req_finished(req); } return rc; } rc = osc_enqueue_fini(req, upcall, cookie, &lockh, einfo->ei_mode, flags, speculative, rc); - if (intent) - ptlrpc_req_finished(req); return rc; } @@ -2904,16 +2887,8 @@ int osc_match_base(const struct lu_env *env, struct obd_export *exp, policy->l_extent.end |= ~PAGE_MASK; /* Next, search for already existing extent locks that will cover us */ - /* If we're trying to read, we also search for an existing PW lock. The - * VFS and page cache already protect us locally, so lots of readers/ - * writers can share a single PW lock. 
- */ - rc = mode; - if (mode == LCK_PR) - rc |= LCK_PW; - rc = ldlm_lock_match_with_skip(obd->obd_namespace, lflags, 0, - res_id, type, policy, rc, lockh, + res_id, type, policy, mode, lockh, match_flags); if (!rc || lflags & LDLM_FL_TEST_LOCK) return rc; diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c index c8b97fa..fedb914 100644 --- a/fs/lustre/ptlrpc/wiretest.c +++ b/fs/lustre/ptlrpc/wiretest.c @@ -1249,6 +1249,8 @@ void lustre_assert_wire_constants(void) OBD_CONNECT2_GETATTR_PFID); LASSERTF(OBD_CONNECT2_LSEEK == 0x40000ULL, "found 0x%.16llxULL\n", OBD_CONNECT2_LSEEK); + LASSERTF(OBD_CONNECT2_DOM_LVB == 0x80000ULL, "found 0x%.16llxULL\n", + OBD_CONNECT2_DOM_LVB); LASSERTF(OBD_CKSUM_CRC32 == 0x00000001UL, "found 0x%.8xUL\n", (unsigned int)OBD_CKSUM_CRC32); LASSERTF(OBD_CKSUM_ADLER == 0x00000002UL, "found 0x%.8xUL\n", diff --git a/include/uapi/linux/lustre/lustre_idl.h b/include/uapi/linux/lustre/lustre_idl.h index f953815..449ac47 100644 --- a/include/uapi/linux/lustre/lustre_idl.h +++ b/include/uapi/linux/lustre/lustre_idl.h @@ -839,6 +839,7 @@ struct ptlrpc_body_v2 { #define OBD_CONNECT2_FIDMAP 0x10000ULL /* FID map */ #define OBD_CONNECT2_GETATTR_PFID 0x20000ULL /* pack parent FID in getattr */ #define OBD_CONNECT2_LSEEK 0x40000ULL /* SEEK_HOLE/DATA RPC */ +#define OBD_CONNECT2_DOM_LVB 0x80000ULL /* pack DOM glimpse data in LVB */ /* XXX README XXX: * Please DO NOT add flag values here before first ensuring that this same * flag value is not in use on some other branch. 
Please clear any such From patchwork Thu Jan 21 17:16:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037211 From: James Simmons To: Andreas Dilger , Oleg
Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:52 -0500 Message-Id: <1611249422-556-30-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 29/39] lustre: llite: fiemap set flags for encrypted files X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Sebastien Buisson FIEMAP ioctl needs to set FIEMAP_EXTENT_DATA_ENCRYPTED|FIEMAP_EXTENT_ENCODED flags for all extents of files encrypted by fscrypt. WC-bug-id: https://jira.whamcloud.com/browse/LU-14149 Lustre-commit: 33322f3a24882d ("LU-14149 llite: fiemap set flags for encrypted files") Signed-off-by: Sebastien Buisson Reviewed-on: https://review.whamcloud.com/40852 Reviewed-by: Andreas Dilger Reviewed-by: Lai Siyao Signed-off-by: James Simmons --- fs/lustre/llite/file.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 5d03fc3..a3a8d1a 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -4986,6 +4986,15 @@ static int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, rc = ll_do_fiemap(inode, fiemap, num_bytes); + if (IS_ENCRYPTED(inode)) { + int i; + + for (i = 0; i < fiemap->fm_mapped_extents; i++) + fiemap->fm_extents[i].fe_flags |= + FIEMAP_EXTENT_DATA_ENCRYPTED | + FIEMAP_EXTENT_ENCODED; + } + fieinfo->fi_flags = fiemap->fm_flags; fieinfo->fi_extents_mapped = fiemap->fm_mapped_extents; if (extent_count > 0 && From patchwork Thu Jan 21 17:16:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037189 From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:53 -0500 Message-Id: <1611249422-556-31-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email
1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 30/39] lustre: ldlm: don't compute sumsq for pool stats X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Remove the calculation of sumsq from the LDLM pool stats, since these stats are almost never used, while conversely the pools are updated frequently. WC-bug-id: https://jira.whamcloud.com/browse/LU-9114 Lustre-commit: 966f6bb550be52e ("LU-9114 ldlm: don't compute sumsq for pool stats") Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/39435 Reviewed-by: Jian Yu Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ldlm/ldlm_pool.c | 33 +++++++++++---------------------- 1 file changed, 11 insertions(+), 22 deletions(-) diff --git a/fs/lustre/ldlm/ldlm_pool.c b/fs/lustre/ldlm/ldlm_pool.c index 9cee24b..2e4d16b 100644 --- a/fs/lustre/ldlm/ldlm_pool.c +++ b/fs/lustre/ldlm/ldlm_pool.c @@ -606,38 +606,27 @@ static int ldlm_pool_debugfs_init(struct ldlm_pool *pl) } lprocfs_counter_init(pl->pl_stats, LDLM_POOL_GRANTED_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "granted", "locks"); + LPROCFS_CNTR_AVGMINMAX, "granted", "locks"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_GRANT_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "grant", "locks"); + LPROCFS_CNTR_AVGMINMAX, "grant", "locks"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_CANCEL_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "cancel", "locks"); + LPROCFS_CNTR_AVGMINMAX, "cancel", "locks"); lprocfs_counter_init(pl->pl_stats, 
LDLM_POOL_GRANT_RATE_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "grant_rate", "locks/s"); + LPROCFS_CNTR_AVGMINMAX, "grant_rate", "locks/s"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_CANCEL_RATE_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "cancel_rate", "locks/s"); + LPROCFS_CNTR_AVGMINMAX, "cancel_rate", "locks/s"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_GRANT_PLAN_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "grant_plan", "locks/s"); + LPROCFS_CNTR_AVGMINMAX, "grant_plan", "locks/s"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_SLV_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "slv", "slv"); + LPROCFS_CNTR_AVGMINMAX, "slv", "slv"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_SHRINK_REQTD_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "shrink_request", "locks"); + LPROCFS_CNTR_AVGMINMAX, "shrink_request", "locks"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_SHRINK_FREED_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "shrink_freed", "locks"); + LPROCFS_CNTR_AVGMINMAX, "shrink_freed", "locks"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_RECALC_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "recalc_freed", "locks"); + LPROCFS_CNTR_AVGMINMAX, "recalc_freed", "locks"); lprocfs_counter_init(pl->pl_stats, LDLM_POOL_TIMING_STAT, - LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV, - "recalc_timing", "sec"); + LPROCFS_CNTR_AVGMINMAX, "recalc_timing", "sec"); debugfs_create_file("stats", 0644, pl->pl_debugfs_entry, pl->pl_stats, &lprocfs_stats_seq_fops); From patchwork Thu Jan 21 17:16:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037193 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, 
HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:54 -0500 Message-Id: <1611249422-556-32-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 31/39] lustre: lov: FIEMAP support for PFL and FLR file X-BeenThere:
lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Bobi Jam * use the high 16 bits of fe_device to record the absolute stripe number from the beginning we are processing, so that continuous call can resume from the stripe specified by it. WC-bug-id: https://jira.whamcloud.com/browse/LU-11848 Lustre-commit: 409719608cf0f60 ("LU-11848 lov: FIEMAP support for PFL and FLR file") Signed-off-by: Bobi Jam Reviewed-on: https://review.whamcloud.com/40766 Reviewed-by: Andreas Dilger Reviewed-by: Alex Zhuravlev Signed-off-by: James Simmons --- fs/lustre/lov/lov_object.c | 248 +++++++++++++++++++----------- fs/lustre/lov/lov_offset.c | 8 +- fs/lustre/ptlrpc/wiretest.c | 1 - include/uapi/linux/lustre/lustre_fiemap.h | 30 +++- 4 files changed, 191 insertions(+), 96 deletions(-) diff --git a/fs/lustre/lov/lov_object.c b/fs/lustre/lov/lov_object.c index 0762cc5..3fcd342 100644 --- a/fs/lustre/lov/lov_object.c +++ b/fs/lustre/lov/lov_object.c @@ -1487,21 +1487,34 @@ static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, int index, int start_stripe, int *stripe_count) { struct lov_stripe_md_entry *lsme = lsm->lsm_entries[index]; + int init_stripe; int last_stripe; - u64 obd_start; - u64 obd_end; int i, j; + init_stripe = lov_stripe_number(lsm, index, ext->e_start); + if (ext->e_end - ext->e_start > lsme->lsme_stripe_size * lsme->lsme_stripe_count) { - last_stripe = (start_stripe < 1 ? lsme->lsme_stripe_count - 1 : - start_stripe - 1); - *stripe_count = lsme->lsme_stripe_count; + if (init_stripe == start_stripe) { + last_stripe = (start_stripe < 1) ? 
+ lsme->lsme_stripe_count - 1 : start_stripe - 1; + *stripe_count = lsme->lsme_stripe_count; + } else if (init_stripe < start_stripe) { + last_stripe = (init_stripe < 1) ? + lsme->lsme_stripe_count - 1 : init_stripe - 1; + *stripe_count = lsme->lsme_stripe_count - + (start_stripe - init_stripe); + } else { + last_stripe = init_stripe - 1; + *stripe_count = init_stripe - start_stripe; + } } else { for (j = 0, i = start_stripe; j < lsme->lsme_stripe_count; i = (i + 1) % lsme->lsme_stripe_count, j++) { - if (lov_stripe_intersects(lsm, index, i, ext, - &obd_start, &obd_end) == 0) + if (!lov_stripe_intersects(lsm, index, i, ext, NULL, + NULL)) + break; + if ((start_stripe != init_stripe) && (i == init_stripe)) break; } *stripe_count = j; @@ -1524,13 +1537,14 @@ static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, int index, static void fiemap_prepare_and_copy_exts(struct fiemap *fiemap, struct fiemap_extent *lcl_fm_ext, int ost_index, unsigned int ext_count, - int current_extent) + int current_extent, int abs_stripeno) { unsigned int ext; char *to; for (ext = 0; ext < ext_count; ext++) { - lcl_fm_ext[ext].fe_device = ost_index; + set_fe_device_stripenr(&lcl_fm_ext[ext], ost_index, + abs_stripeno); lcl_fm_ext[ext].fe_flags |= FIEMAP_EXTENT_NET; } @@ -1565,26 +1579,14 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap, { struct lov_stripe_md_entry *lsme = lsm->lsm_entries[index]; u64 local_end = fiemap->fm_extents[0].fe_logical; - u64 lun_start, lun_end; + u64 lun_end; u64 fm_end_offset; int stripe_no = -1; - int i; if (!fiemap->fm_extent_count || !fiemap->fm_extents[0].fe_logical) return 0; - /* Find out stripe_no from ost_index saved in the fe_device */ - for (i = 0; i < lsme->lsme_stripe_count; i++) { - struct lov_oinfo *oinfo = lsme->lsme_oinfo[i]; - - if (lov_oinfo_is_dummy(oinfo)) - continue; - - if (oinfo->loi_ost_idx == fiemap->fm_extents[0].fe_device) { - stripe_no = i; - break; - } - } + stripe_no = *start_stripe; if (stripe_no == -1) return 
-EINVAL; @@ -1593,11 +1595,9 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap, * If we have finished mapping on previous device, shift logical * offset to start of next device */ - if (lov_stripe_intersects(lsm, index, stripe_no, ext, - &lun_start, &lun_end) != 0 && + if (lov_stripe_intersects(lsm, index, stripe_no, ext, NULL, &lun_end) && local_end < lun_end) { fm_end_offset = local_end; - *start_stripe = stripe_no; } else { /* This is a special value to indicate that caller should * calculate offset in next stripe. @@ -1611,16 +1611,16 @@ static u64 fiemap_calc_fm_end_offset(struct fiemap *fiemap, struct fiemap_state { struct fiemap *fs_fm; - struct lu_extent fs_ext; + struct lu_extent fs_ext; /* current entry extent */ u64 fs_length; - u64 fs_end_offset; - int fs_cur_extent; - int fs_cnt_need; + u64 fs_end_offset; /* last iteration offset */ + int fs_cur_extent; /* collected exts so far */ + int fs_cnt_need; /* # of extents buf can hold */ int fs_start_stripe; int fs_last_stripe; - bool fs_device_done; - bool fs_finish_stripe; - bool fs_enough; + bool fs_device_done; /* enough for this OST */ + bool fs_finish_stripe; /* reached fs_last_stripe */ + bool fs_enough; /* enough for this call */ }; static struct cl_object *lov_find_subobj(const struct lu_env *env, @@ -1669,17 +1669,17 @@ static struct cl_object *lov_find_subobj(const struct lu_env *env, static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj, struct lov_stripe_md *lsm, struct fiemap *fiemap, size_t *buflen, struct ll_fiemap_info_key *fmkey, - int index, int stripeno, struct fiemap_state *fs) + int index, int stripe_last, int stripeno, + struct fiemap_state *fs) { struct lov_stripe_md_entry *lsme = lsm->lsm_entries[index]; struct cl_object *subobj; struct lov_obd *lov = lu2lov_dev(obj->co_lu.lo_dev)->ld_lov; struct fiemap_extent *fm_ext = &fs->fs_fm->fm_extents[0]; - u64 req_fm_len; /* Stores length of required mapping */ + u64 req_fm_len; /* max requested extent 
coverage */ u64 len_mapped_single_call; - u64 lun_start; - u64 lun_end; - u64 obd_object_end; + u64 obd_start; + u64 obd_end; unsigned int ext_count; /* EOF for object */ bool ost_eof = false; @@ -1691,24 +1691,24 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj, fs->fs_device_done = false; /* Find out range of mapping on this stripe */ if ((lov_stripe_intersects(lsm, index, stripeno, &fs->fs_ext, - &lun_start, &obd_object_end)) == 0) + &obd_start, &obd_end)) == 0) return 0; if (lov_oinfo_is_dummy(lsme->lsme_oinfo[stripeno])) return -EIO; /* If this is a continuation FIEMAP call and we are on - * starting stripe then lun_start needs to be set to + * starting stripe then obd_start needs to be set to * end_offset */ if (fs->fs_end_offset != 0 && stripeno == fs->fs_start_stripe) - lun_start = fs->fs_end_offset; + obd_start = fs->fs_end_offset; - lun_end = lov_size_to_stripe(lsm, index, fs->fs_ext.e_end, stripeno); - if (lun_start == lun_end) + if (lov_size_to_stripe(lsm, index, fs->fs_ext.e_end, stripeno) == + obd_start) return 0; - req_fm_len = obd_object_end - lun_start + 1; + req_fm_len = obd_end - obd_start + 1; fs->fs_fm->fm_length = 0; len_mapped_single_call = 0; @@ -1729,7 +1729,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj, fs->fs_cur_extent; } - lun_start += len_mapped_single_call; + obd_start += len_mapped_single_call; fs->fs_fm->fm_length = req_fm_len - len_mapped_single_call; req_fm_len = fs->fs_fm->fm_length; /** @@ -1753,14 +1753,14 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj, fs->fs_fm->fm_flags |= FIEMAP_EXTENT_LAST; fs->fs_fm->fm_mapped_extents = 1; - fm_ext[0].fe_logical = lun_start; - fm_ext[0].fe_length = obd_object_end - lun_start + 1; + fm_ext[0].fe_logical = obd_start; + fm_ext[0].fe_length = obd_end - obd_start + 1; fm_ext[0].fe_flags |= FIEMAP_EXTENT_UNKNOWN; goto inactive_tgt; } - fs->fs_fm->fm_start = lun_start; + fs->fs_fm->fm_start = 
obd_start; fs->fs_fm->fm_flags &= ~FIEMAP_FLAG_DEVICE_ORDER; memcpy(&fmkey->lfik_fiemap, fs->fs_fm, sizeof(*fs->fs_fm)); *buflen = fiemap_count_to_size(fs->fs_fm->fm_extent_count); @@ -1799,7 +1799,7 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj, /* prepare to copy retrived map extents */ len_mapped_single_call = fm_ext[ext_count - 1].fe_logical + fm_ext[ext_count - 1].fe_length - - lun_start; + obd_start; /* Have we finished mapping on this device? */ if (req_fm_len <= len_mapped_single_call) { @@ -1821,7 +1821,8 @@ static int fiemap_for_stripe(const struct lu_env *env, struct cl_object *obj, } fiemap_prepare_and_copy_exts(fiemap, fm_ext, ost_index, - ext_count, fs->fs_cur_extent); + ext_count, fs->fs_cur_extent, + stripe_last + stripeno); fs->fs_cur_extent += ext_count; /* Ran out of available extents? */ @@ -1863,12 +1864,17 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj, loff_t whole_start; loff_t whole_end; int entry; - int start_entry; + int start_entry = -1; int end_entry; int cur_stripe = 0; int stripe_count; int rc = 0; struct fiemap_state fs = { NULL }; + struct lu_extent range; + int cur_ext; + int stripe_last; + int start_stripe = 0; + bool resume = false; lsm = lov_lsm_addref(cl2lov(obj)); if (!lsm) { @@ -1936,8 +1942,6 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj, */ if (fiemap_count_to_size(fiemap->fm_extent_count) > *buflen) fiemap->fm_extent_count = fiemap_size_to_count(*buflen); - if (!fiemap->fm_extent_count) - fs.fs_cnt_need = 0; fs.fs_enough = false; fs.fs_cur_extent = 0; @@ -1951,73 +1955,142 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj, goto out_fm_local; } whole_end = (fiemap->fm_length == OBD_OBJECT_EOF) ? 
- fmkey->lfik_oa.o_size : - whole_start + fiemap->fm_length - 1; + fmkey->lfik_oa.o_size + 1 : + whole_start + fiemap->fm_length; /** * If fiemap->fm_length != OBD_OBJECT_EOF but whole_end exceeds file * size */ - if (whole_end > fmkey->lfik_oa.o_size) - whole_end = fmkey->lfik_oa.o_size; + if (whole_end > fmkey->lfik_oa.o_size + 1) + whole_end = fmkey->lfik_oa.o_size + 1; - start_entry = lov_lsm_entry(lsm, whole_start); - end_entry = lov_lsm_entry(lsm, whole_end); - if (end_entry == -1) - end_entry = lsm->lsm_entry_count - 1; + /** + * the high 16bits of fe_device remember which stripe the last + * call has been arrived, we'd continue from there in this call. + */ + if (fiemap->fm_extent_count && fiemap->fm_extents[0].fe_logical) + resume = true; + stripe_last = get_fe_stripenr(&fiemap->fm_extents[0]); + /** + * stripe_last records stripe number we've been processed in the last + * call + */ + end_entry = lsm->lsm_entry_count - 1; + cur_stripe = 0; + for (entry = 0; entry <= end_entry; entry++) { + lsme = lsm->lsm_entries[entry]; + if (cur_stripe + lsme->lsme_stripe_count >= stripe_last) { + start_entry = entry; + start_stripe = stripe_last - cur_stripe; + break; + } + cur_stripe += lsme->lsme_stripe_count; + } - if (start_entry == -1 || end_entry == -1) { + if (start_entry == -1) { + CERROR(DFID": FIEMAP does not init start entry, cur_stripe=%d, stripe_last=%d\n", + PFID(lu_object_fid(&obj->co_lu)), + cur_stripe, stripe_last); rc = -EINVAL; goto out_fm_local; } + /** + * @start_entry & @start_stripe records the position of fiemap + * resumption @stripe_last keeps recording the absolution position + * we'are processing. @resume indicates we'd honor @start_stripe. 
+ */ + + range.e_start = whole_start; + range.e_end = whole_end; - /* TODO: rewrite it with lov_foreach_io_layout() */ for (entry = start_entry; entry <= end_entry; entry++) { + /* remeber to update stripe_last accordingly */ lsme = lsm->lsm_entries[entry]; - if (!lsme_inited(lsme)) - break; + /* FLR could contain component holes between entries */ + if (!lsme_inited(lsme)) { + stripe_last += lsme->lsme_stripe_count; + resume = false; + continue; + } - if (entry == start_entry) - fs.fs_ext.e_start = whole_start; - else + if (!lu_extent_is_overlapped(&range, &lsme->lsme_extent)) { + stripe_last += lsme->lsme_stripe_count; + resume = false; + continue; + } + + /* prepare for a component entry iteration */ + if (lsme->lsme_extent.e_start > whole_start) fs.fs_ext.e_start = lsme->lsme_extent.e_start; - if (entry == end_entry) + else + fs.fs_ext.e_start = whole_start; + if (lsme->lsme_extent.e_end > whole_end) fs.fs_ext.e_end = whole_end; else - fs.fs_ext.e_end = lsme->lsme_extent.e_end - 1; - fs.fs_length = fs.fs_ext.e_end - fs.fs_ext.e_start + 1; + fs.fs_ext.e_end = lsme->lsme_extent.e_end; /* Calculate start stripe, last stripe and length of mapping */ - fs.fs_start_stripe = lov_stripe_number(lsm, entry, - fs.fs_ext.e_start); + if (resume) { + fs.fs_start_stripe = start_stripe; + /* put stripe_last to the first stripe of the comp */ + stripe_last -= start_stripe; + resume = false; + } else { + fs.fs_start_stripe = lov_stripe_number(lsm, entry, + fs.fs_ext.e_start); + } fs.fs_last_stripe = fiemap_calc_last_stripe(lsm, entry, &fs.fs_ext, fs.fs_start_stripe, &stripe_count); - fs.fs_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, entry, - &fs.fs_ext, - &fs.fs_start_stripe); + /** + * A new mirror component is under process, reset + * fs.fs_end_offset and then fiemap_for_stripe() starts from + * the overlapping extent, otherwise starts from + * fs.fs_end_offset. 
+ */ + if (entry > start_entry && lsme->lsme_extent.e_start == 0) { + /* new mirror */ + fs.fs_end_offset = 0; + } else { + fs.fs_end_offset = fiemap_calc_fm_end_offset(fiemap, + lsm, entry, + &fs.fs_ext, + &fs.fs_start_stripe); + } + /* Check each stripe */ for (cur_stripe = fs.fs_start_stripe; stripe_count > 0; --stripe_count, cur_stripe = (cur_stripe + 1) % lsme->lsme_stripe_count) { + /* reset fs_finish_stripe */ + fs.fs_finish_stripe = false; rc = fiemap_for_stripe(env, obj, lsm, fiemap, buflen, - fmkey, entry, cur_stripe, &fs); + fmkey, entry, stripe_last, + cur_stripe, &fs); if (rc < 0) goto out_fm_local; - if (fs.fs_enough) + if (fs.fs_enough) { + stripe_last += cur_stripe; goto finish; + } if (fs.fs_finish_stripe) break; } /* for each stripe */ - } /* for covering layout component */ + stripe_last += lsme->lsme_stripe_count; + } /* for covering layout component entry */ - /* - * We've traversed all components, set @entry to the last component - * entry, it's for the last stripe check. - */ - entry--; finish: + if (fs.fs_cur_extent > 0) + cur_ext = fs.fs_cur_extent - 1; + else + cur_ext = 0; + + /* done all the processing */ + if (entry > end_entry) + fiemap->fm_extents[cur_ext].fe_flags |= FIEMAP_EXTENT_LAST; + /* * Indicate that we are returning device offsets unless file just has * single stripe @@ -2030,13 +2103,6 @@ static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj, if (!fiemap->fm_extent_count) goto skip_last_device_calc; - /* - * Check if we have reached the last stripe and whether mapping for that - * stripe is done. 
- */ - if ((cur_stripe == fs.fs_last_stripe) && fs.fs_device_done) - fiemap->fm_extents[fs.fs_cur_extent - 1].fe_flags |= - FIEMAP_EXTENT_LAST; skip_last_device_calc: fiemap->fm_mapped_extents = fs.fs_cur_extent; out_fm_local: diff --git a/fs/lustre/lov/lov_offset.c b/fs/lustre/lov/lov_offset.c index b53ce43..ca763af 100644 --- a/fs/lustre/lov/lov_offset.c +++ b/fs/lustre/lov/lov_offset.c @@ -227,18 +227,24 @@ u64 lov_size_to_stripe(struct lov_stripe_md *lsm, int index, u64 file_size, * that is contained within the lov extent. this returns true if the given * stripe does intersect with the lov extent. * - * Closed interval [@obd_start, @obd_end] will be returned. + * Closed interval [@obd_start, @obd_end] will be returned if caller needs them. */ int lov_stripe_intersects(struct lov_stripe_md *lsm, int index, int stripeno, struct lu_extent *ext, u64 *obd_start, u64 *obd_end) { struct lov_stripe_md_entry *entry = lsm->lsm_entries[index]; int start_side, end_side; + u64 loc_start, loc_end; u64 start, end; if (!lu_extent_is_overlapped(ext, &entry->lsme_extent)) return 0; + if (!obd_start) + obd_start = &loc_start; + if (!obd_end) + obd_end = &loc_end; + start = max_t(u64, ext->e_start, entry->lsme_extent.e_start); end = min_t(u64, ext->e_end, entry->lsme_extent.e_end); if (end != OBD_OBJECT_EOF) diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c index fedb914..a500a87 100644 --- a/fs/lustre/ptlrpc/wiretest.c +++ b/fs/lustre/ptlrpc/wiretest.c @@ -4262,7 +4262,6 @@ void lustre_assert_wire_constants(void) BUILD_BUG_ON(FIEMAP_EXTENT_UNWRITTEN != 0x00000800); BUILD_BUG_ON(FIEMAP_EXTENT_MERGED != 0x00001000); BUILD_BUG_ON(FIEMAP_EXTENT_SHARED != 0x00002000); - BUILD_BUG_ON(FIEMAP_EXTENT_NO_DIRECT != 0x40000000); BUILD_BUG_ON(FIEMAP_EXTENT_NET != 0x80000000); #ifdef CONFIG_FS_POSIX_ACL diff --git a/include/uapi/linux/lustre/lustre_fiemap.h b/include/uapi/linux/lustre/lustre_fiemap.h index 4ae1850..f93e107 100644 --- 
a/include/uapi/linux/lustre/lustre_fiemap.h +++ b/include/uapi/linux/lustre/lustre_fiemap.h @@ -43,9 +43,35 @@ #include #include -/* XXX: We use fiemap_extent::fe_reserved[0] */ +/** + * XXX: We use fiemap_extent::fe_reserved[0], notice the high 16bits of it + * is used to locate the stripe number starting from the very beginning to + * resume the fiemap call. + */ #define fe_device fe_reserved[0] +static inline int get_fe_device(struct fiemap_extent *fe) +{ + return fe->fe_device & 0xffff; +} +static inline void set_fe_device(struct fiemap_extent *fe, int devno) +{ + fe->fe_device = (fe->fe_device & 0xffff0000) | (devno & 0xffff); +} +static inline int get_fe_stripenr(struct fiemap_extent *fe) +{ + return fe->fe_device >> 16; +} +static inline void set_fe_stripenr(struct fiemap_extent *fe, int nr) +{ + fe->fe_device = (fe->fe_device & 0xffff) | (nr << 16); +} +static inline void set_fe_device_stripenr(struct fiemap_extent *fe, int devno, + int nr) +{ + fe->fe_device = (nr << 16) | (devno & 0xffff); +} + static inline __kernel_size_t fiemap_count_to_size(__kernel_size_t extent_count) { return sizeof(struct fiemap) + extent_count * @@ -64,8 +90,6 @@ static inline unsigned int fiemap_size_to_count(__kernel_size_t array_size) #undef FIEMAP_FLAGS_COMPAT #endif -/* Lustre specific flags - use a high bit, don't conflict with upstream flag */ -#define FIEMAP_EXTENT_NO_DIRECT 0x40000000 /* Data mapping undefined */ #define FIEMAP_EXTENT_NET 0x80000000 /* Data stored remotely. 
* Sets NO_DIRECT flag */ From patchwork Thu Jan 21 17:16:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037167 From: James Simmons To: Andreas Dilger , 
Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:55 -0500 Message-Id: <1611249422-556-33-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 32/39] lustre: mdc: process changelogs_catalog from the oldest rec X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Etienne AUJAMES chlg_load() uses LLOG_CAT_FIRST to process changelogs. With this value, records in the catalog are always processed starting at index 0 and continuing to the newest record. So when the catalog reaches the end of its indexes and records are saved at the beginning of the catalog again, llog_cat_process() ignores the records at the end. This patch changes the "startcat" value from LLOG_CAT_FIRST to 0 so that the catalog is scanned from the oldest record to the newest. 
Fixes: d95486c4 (lustre: mdc: polling mode for changelog reader) WC-bug-id: https://jira.whamcloud.com/browse/LU-14158 Lustre-commit: ad4c8633498848 ("LU-14158 mdc: process changelogs_catalog from the oldest rec") Signed-off-by: Etienne AUJAMES Reviewed-on: https://review.whamcloud.com/40786 Reviewed-by: Sebastien Buisson Reviewed-by: Mike Pershin Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/mdc/mdc_changelog.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/lustre/mdc/mdc_changelog.c b/fs/lustre/mdc/mdc_changelog.c index 8531edb..f671f46 100644 --- a/fs/lustre/mdc/mdc_changelog.c +++ b/fs/lustre/mdc/mdc_changelog.c @@ -287,7 +287,7 @@ static int chlg_load(void *args) struct llog_handle *llh = NULL; int rc; - crs->crs_last_catidx = -1; + crs->crs_last_catidx = 0; crs->crs_last_idx = 0; again: From patchwork Thu Jan 21 17:16:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037199 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8C52BC433DB for ; Thu, 21 Jan 2021 17:18:44 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3F13623A57 for ; Thu, 21 Jan 2021 17:18:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3F13623A57 Authentication-Results: mail.kernel.org; dmarc=none 
(p=none dis=none) header.from=infradead.org From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:56 -0500 Message-Id: <1611249422-556-34-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 33/39] lustre: ldlm: Use req_mode while lock cleanup X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Yang Sheng , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Yang Sheng For a local lock that has not been granted yet, the reference cannot be dropped correctly using l_granted_mode, so use l_req_mode for the decref during lock cleanup. LustreError: (ldlm_lock.c:354:ldlm_lock_destroy_internal()) ### lock still has references ns: ?? lock: ffff88342aa07200/0x9b92ad3407bea22a lrc: 4/0,1 mode: --/PW res: ?? rrc=?? type: ??? 
flags: 0x10106400000000 nid: local remote: 0x5248822d3123ac19 expref: -99 pid: 14515 timeout: 0 lvb_type: 0 LustreError: (ldlm_lock.c:355:ldlm_lock_destroy_internal()) LBUG Pid: 14562, comm: ll_imp_inval 3.10.0-693.21.1.el7.x86_64 #1 SMP Call Trace: [] save_stack_trace_tsk+0x22/0x40 [] libcfs_call_trace+0x8c/0xc0 [libcfs] [] lbug_with_loc+0x4c/0xa0 [libcfs] [] ldlm_lock_destroy_internal+0x269/0x2a0 [ptlrpc] [] ldlm_lock_destroy_nolock+0x2b/0x110 [ptlrpc] [] ldlm_flock_completion_ast+0x4f5/0x1080 [ptlrpc] [] cleanup_resource+0x18e/0x370 [ptlrpc] [] ldlm_resource_clean+0x53/0x60 [ptlrpc] [] cfs_hash_for_each_relax+0x250/0x450 [libcfs] [] cfs_hash_for_each_nolock+0x75/0x1c0 [libcfs] [] ldlm_namespace_cleanup+0x30/0xc0 [ptlrpc] [] mdc_import_event+0x1b6/0xa20 [mdc] [] ptlrpc_invalidate_import+0x220/0x8f0 [ptlrpc] [] ptlrpc_invalidate_import_thread+0x48/0x2b0 [ptlrpc] [] kthread+0xd1/0xe0 WC-bug-id: https://jira.whamcloud.com/browse/LU-14082 Lustre-commit: a11c18cbab00d0 ("LU-14082 ldlm: Use req_mode while lock cleanup") Signed-off-by: Yang Sheng Reviewed-on: https://review.whamcloud.com/40433 Reviewed-by: Andreas Dilger Reviewed-by: Bobi Jam Reviewed-by: Alex Zhuravlev Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ldlm/ldlm_flock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/lustre/ldlm/ldlm_flock.c b/fs/lustre/ldlm/ldlm_flock.c index 720362f..b4916cb15 100644 --- a/fs/lustre/ldlm/ldlm_flock.c +++ b/fs/lustre/ldlm/ldlm_flock.c @@ -414,7 +414,7 @@ static int ldlm_process_flock_lock(struct ldlm_lock *req) if (ldlm_is_test_lock(lock) || ldlm_is_flock_deadlock(lock)) mode = getlk->fl_type; else - mode = lock->l_granted_mode; + mode = lock->l_req_mode; if (ldlm_is_flock_deadlock(lock)) { LDLM_DEBUG(lock, From patchwork Thu Jan 21 17:16:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037213 Return-Path: 
From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:57 -0500 Message-Id: <1611249422-556-35-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: 
<1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 34/39] lnet: socklnd: announce deprecation of 'use_tcp_bonding' X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Serguei Smirnov , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Serguei Smirnov Print a warning if the 'use_tcp_bonding' option is used, notifying the user that the feature is being deprecated. A Multi-Rail (MR) configuration with dynamic discovery is suggested instead. The Multi-Rail feature doesn't need to be explicitly enabled. To use MR instead of TCP bonding, group the interfaces on the same network using the lnetctl utility: lnetctl net add --net tcp --if eth2,eth3 or via the modprobe configuration file (/etc/modprobe.d/lnet.conf or /etc/modprobe.d/lustre.conf): options lnet networks="tcp(eth2,eth3)" and make sure dynamic discovery is enabled: lnetctl set discovery 1 MR will aggregate the throughput of all configured and available networks/interfaces shared between peer nodes. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-13641 Lustre-commit: 1a2bf911b97936 ("LU-13641 socklnd: announce deprecation of 'use_tcp_bonding'") Signed-off-by: Serguei Smirnov Reviewed-on: https://review.whamcloud.com/41088 Reviewed-by: Andreas Dilger Reviewed-by: Cyril Bordage Signed-off-by: James Simmons --- net/lnet/lnet/api-ni.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index 322b25d..c3bf444 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -70,7 +70,7 @@ struct lnet the_lnet = { static int use_tcp_bonding = false; module_param(use_tcp_bonding, int, 0444); MODULE_PARM_DESC(use_tcp_bonding, - "Set to 1 to use socklnd bonding. 0 to use Multi-Rail"); + "use_tcp_bonding parameter has been deprecated"); unsigned int lnet_numa_range; EXPORT_SYMBOL(lnet_numa_range); @@ -2610,8 +2610,10 @@ void lnet_lib_exit(void) goto err_empty_list; } - /* - * If LNet is being initialized via DLC it is possible + if (use_tcp_bonding) + CWARN("'use_tcp_bonding' option has been deprecated. See LU-13641\n"); + + /* If LNet is being initialized via DLC it is possible * that the user requests not to load module parameters (ones which * are supported by DLC) on initialization. 
Therefore, make sure not * to load networks, routes and forwarding from module parameters From patchwork Thu Jan 21 17:16:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037215 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D48CDC433E0 for ; Thu, 21 Jan 2021 17:19:13 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7000123A57 for ; Thu, 21 Jan 2021 17:19:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7000123A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 3716521FFE7; Thu, 21 Jan 2021 09:18:17 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 98B5721FBE8 for ; Thu, 21 Jan 2021 09:17:15 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 88B721008493; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 878CD1B49C; Thu, 21 Jan 
2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:58 -0500 Message-Id: <1611249422-556-36-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 35/39] lnet: o2iblnd: remove FMR-pool support. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Linux 5.8 removes the FMR-pool API. WC-bug-id: https://jira.whamcloud.com/browse/LU-13783 Lustre-commit: 6fd5c8bef83aaf ("LU-13783 o2iblnd: make FMR-pool support optional.") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/40287 Reviewed-by: Sergey Gorenko Reviewed-by: James Simmons Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd.c | 268 +++++++++++------------------------- net/lnet/klnds/o2iblnd/o2iblnd.h | 6 - net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 27 +--- 3 files changed, 81 insertions(+), 220 deletions(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c index fc515fc..9147d17 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd.c @@ -1313,27 +1313,23 @@ static void kiblnd_map_tx_pool(struct kib_tx_pool *tpo) static void kiblnd_destroy_fmr_pool(struct kib_fmr_pool *fpo) { - LASSERT(!fpo->fpo_map_count); + struct kib_fast_reg_descriptor *frd; + int i = 0; - if (!IS_ERR_OR_NULL(fpo->fmr.fpo_fmr_pool)) { - ib_destroy_fmr_pool(fpo->fmr.fpo_fmr_pool); - } else { - struct kib_fast_reg_descriptor *frd; - int i = 0; + 
LASSERT(!fpo->fpo_map_count); - while (!list_empty(&fpo->fast_reg.fpo_pool_list)) { - frd = list_first_entry(&fpo->fast_reg.fpo_pool_list, - struct kib_fast_reg_descriptor, - frd_list); - list_del(&frd->frd_list); - ib_dereg_mr(frd->frd_mr); - kfree(frd); - i++; - } - if (i < fpo->fast_reg.fpo_pool_size) - CERROR("FastReg pool still has %d regions registered\n", - fpo->fast_reg.fpo_pool_size - i); + while (!list_empty(&fpo->fast_reg.fpo_pool_list)) { + frd = list_first_entry(&fpo->fast_reg.fpo_pool_list, + struct kib_fast_reg_descriptor, + frd_list); + list_del(&frd->frd_list); + ib_dereg_mr(frd->frd_mr); + kfree(frd); + i++; } + if (i < fpo->fast_reg.fpo_pool_size) + CERROR("FastReg pool still has %d regions registered\n", + fpo->fast_reg.fpo_pool_size - i); if (fpo->fpo_hdev) kiblnd_hdev_decref(fpo->fpo_hdev); @@ -1370,34 +1366,6 @@ static void kiblnd_destroy_fmr_pool_list(struct list_head *head) return max(IBLND_FMR_POOL_FLUSH, size); } -static int kiblnd_alloc_fmr_pool(struct kib_fmr_poolset *fps, struct kib_fmr_pool *fpo) -{ - struct ib_fmr_pool_param param = { - .max_pages_per_fmr = LNET_MAX_IOV, - .page_shift = PAGE_SHIFT, - .access = (IB_ACCESS_LOCAL_WRITE | - IB_ACCESS_REMOTE_WRITE), - .pool_size = fps->fps_pool_size, - .dirty_watermark = fps->fps_flush_trigger, - .flush_function = NULL, - .flush_arg = NULL, - .cache = !!fps->fps_cache - }; - int rc = 0; - - fpo->fmr.fpo_fmr_pool = ib_create_fmr_pool(fpo->fpo_hdev->ibh_pd, - ¶m); - if (IS_ERR(fpo->fmr.fpo_fmr_pool)) { - rc = PTR_ERR(fpo->fmr.fpo_fmr_pool); - if (rc != -ENOSYS) - CERROR("Failed to create FMR pool: %d\n", rc); - else - CERROR("FMRs are not supported\n"); - } - - return rc; -} - static int kiblnd_alloc_freg_pool(struct kib_fmr_poolset *fps, struct kib_fmr_pool *fpo, enum kib_dev_caps dev_caps) @@ -1481,10 +1449,7 @@ static int kiblnd_create_fmr_pool(struct kib_fmr_poolset *fps, fpo->fpo_hdev = kiblnd_current_hdev(dev); dev_attr = &fpo->fpo_hdev->ibh_ibdev->attrs; - if (dev->ibd_dev_caps & 
IBLND_DEV_CAPS_FMR_ENABLED) - rc = kiblnd_alloc_fmr_pool(fps, fpo); - else - rc = kiblnd_alloc_freg_pool(fps, fpo, dev->ibd_dev_caps); + rc = kiblnd_alloc_freg_pool(fps, fpo, dev->ibd_dev_caps); if (rc) goto out_fpo; @@ -1568,61 +1533,25 @@ static int kiblnd_fmr_pool_is_idle(struct kib_fmr_pool *fpo, time64_t now) return now >= fpo->fpo_deadline; } -static int -kiblnd_map_tx_pages(struct kib_tx *tx, struct kib_rdma_desc *rd) -{ - u64 *pages = tx->tx_pages; - struct kib_hca_dev *hdev; - int npages; - int size; - int i; - - hdev = tx->tx_pool->tpo_hdev; - - for (i = 0, npages = 0; i < rd->rd_nfrags; i++) { - for (size = 0; size < rd->rd_frags[i].rf_nob; - size += hdev->ibh_page_size) { - pages[npages++] = (rd->rd_frags[i].rf_addr & - hdev->ibh_page_mask) + size; - } - } - - return npages; -} - void kiblnd_fmr_pool_unmap(struct kib_fmr *fmr, int status) { + struct kib_fast_reg_descriptor *frd = fmr->fmr_frd; LIST_HEAD(zombies); struct kib_fmr_pool *fpo = fmr->fmr_pool; struct kib_fmr_poolset *fps; time64_t now = ktime_get_seconds(); struct kib_fmr_pool *tmp; - int rc; if (!fpo) return; fps = fpo->fpo_owner; - if (!IS_ERR_OR_NULL(fpo->fmr.fpo_fmr_pool)) { - if (fmr->fmr_pfmr) { - ib_fmr_pool_unmap(fmr->fmr_pfmr); - fmr->fmr_pfmr = NULL; - } - - if (status) { - rc = ib_flush_fmr_pool(fpo->fmr.fpo_fmr_pool); - LASSERT(!rc); - } - } else { - struct kib_fast_reg_descriptor *frd = fmr->fmr_frd; - - if (frd) { - frd->frd_valid = false; - spin_lock(&fps->fps_lock); - list_add_tail(&frd->frd_list, &fpo->fast_reg.fpo_pool_list); - spin_unlock(&fps->fps_lock); - fmr->fmr_frd = NULL; - } + if (frd) { + frd->frd_valid = false; + spin_lock(&fps->fps_lock); + list_add_tail(&frd->frd_list, &fpo->fast_reg.fpo_pool_list); + spin_unlock(&fps->fps_lock); + fmr->fmr_frd = NULL; } fmr->fmr_pool = NULL; @@ -1649,11 +1578,8 @@ int kiblnd_fmr_pool_map(struct kib_fmr_poolset *fps, struct kib_tx *tx, struct kib_rdma_desc *rd, u32 nob, u64 iov, struct kib_fmr *fmr) { - u64 *pages = tx->tx_pages; 
bool is_rx = (rd != tx->tx_rd); - bool tx_pages_mapped = false; struct kib_fmr_pool *fpo; - int npages = 0; u64 version; int rc; @@ -1664,96 +1590,65 @@ int kiblnd_fmr_pool_map(struct kib_fmr_poolset *fps, struct kib_tx *tx, fpo->fpo_deadline = ktime_get_seconds() + IBLND_POOL_DEADLINE; fpo->fpo_map_count++; - if (!IS_ERR_OR_NULL(fpo->fmr.fpo_fmr_pool)) { - struct ib_pool_fmr *pfmr; + if (!list_empty(&fpo->fast_reg.fpo_pool_list)) { + struct kib_fast_reg_descriptor *frd; + struct ib_reg_wr *wr; + struct ib_mr *mr; + int n; + frd = list_first_entry(&fpo->fast_reg.fpo_pool_list, + struct kib_fast_reg_descriptor, + frd_list); + list_del(&frd->frd_list); spin_unlock(&fps->fps_lock); - if (!tx_pages_mapped) { - npages = kiblnd_map_tx_pages(tx, rd); - tx_pages_mapped = 1; - } + mr = frd->frd_mr; - pfmr = ib_fmr_pool_map_phys(fpo->fmr.fpo_fmr_pool, - pages, npages, iov); - if (likely(!IS_ERR(pfmr))) { - fmr->fmr_key = is_rx ? pfmr->fmr->rkey : - pfmr->fmr->lkey; - fmr->fmr_frd = NULL; - fmr->fmr_pfmr = pfmr; - fmr->fmr_pool = fpo; - return 0; + if (!frd->frd_valid) { + u32 key = is_rx ? mr->rkey : mr->lkey; + struct ib_send_wr *inv_wr; + + inv_wr = &frd->frd_inv_wr; + memset(inv_wr, 0, sizeof(*inv_wr)); + inv_wr->opcode = IB_WR_LOCAL_INV; + inv_wr->wr_id = IBLND_WID_MR; + inv_wr->ex.invalidate_rkey = key; + + /* Bump the key */ + key = ib_inc_rkey(key); + ib_update_fast_reg_key(mr, key); } - rc = PTR_ERR(pfmr); - } else { - if (!list_empty(&fpo->fast_reg.fpo_pool_list)) { - struct kib_fast_reg_descriptor *frd; - struct ib_reg_wr *wr; - struct ib_mr *mr; - int n; - - frd = list_first_entry(&fpo->fast_reg.fpo_pool_list, - struct kib_fast_reg_descriptor, - frd_list); - list_del(&frd->frd_list); - spin_unlock(&fps->fps_lock); - - mr = frd->frd_mr; - - if (!frd->frd_valid) { - u32 key = is_rx ? 
mr->rkey : mr->lkey; - struct ib_send_wr *inv_wr; - - inv_wr = &frd->frd_inv_wr; - memset(inv_wr, 0, sizeof(*inv_wr)); - inv_wr->opcode = IB_WR_LOCAL_INV; - inv_wr->wr_id = IBLND_WID_MR; - inv_wr->ex.invalidate_rkey = key; - - /* Bump the key */ - key = ib_inc_rkey(key); - ib_update_fast_reg_key(mr, key); - } - - n = ib_map_mr_sg(mr, tx->tx_frags, - rd->rd_nfrags, NULL, - PAGE_SIZE); - if (unlikely(n != rd->rd_nfrags)) { - CERROR("Failed to map mr %d/%d elements\n", - n, rd->rd_nfrags); - return n < 0 ? n : -EINVAL; - } - - /* Prepare FastReg WR */ - wr = &frd->frd_fastreg_wr; - memset(wr, 0, sizeof(*wr)); - wr->wr.opcode = IB_WR_REG_MR; - wr->wr.wr_id = IBLND_WID_MR; - wr->wr.num_sge = 0; - wr->wr.send_flags = 0; - wr->mr = mr; - wr->key = is_rx ? mr->rkey : mr->lkey; - wr->access = (IB_ACCESS_LOCAL_WRITE | - IB_ACCESS_REMOTE_WRITE); - - fmr->fmr_key = is_rx ? mr->rkey : mr->lkey; - fmr->fmr_frd = frd; - fmr->fmr_pfmr = NULL; - fmr->fmr_pool = fpo; - return 0; + + n = ib_map_mr_sg(mr, tx->tx_frags, + rd->rd_nfrags, NULL, + PAGE_SIZE); + if (unlikely(n != rd->rd_nfrags)) { + CERROR("Failed to map mr %d/%d elements\n", + n, rd->rd_nfrags); + return n < 0 ? n : -EINVAL; } - spin_unlock(&fps->fps_lock); - rc = -EAGAIN; - } - spin_lock(&fps->fps_lock); - fpo->fpo_map_count--; - if (rc != -EAGAIN) { - spin_unlock(&fps->fps_lock); - return rc; + /* Prepare FastReg WR */ + wr = &frd->frd_fastreg_wr; + memset(wr, 0, sizeof(*wr)); + wr->wr.opcode = IB_WR_REG_MR; + wr->wr.wr_id = IBLND_WID_MR; + wr->wr.num_sge = 0; + wr->wr.send_flags = 0; + wr->mr = mr; + wr->key = is_rx ? mr->rkey : mr->lkey; + wr->access = (IB_ACCESS_LOCAL_WRITE | + IB_ACCESS_REMOTE_WRITE); + + fmr->fmr_key = is_rx ? mr->rkey : mr->lkey; + fmr->fmr_frd = frd; + fmr->fmr_pool = fpo; + return 0; } /* EAGAIN and ... 
*/ + rc = -EAGAIN; + fpo->fpo_map_count--; if (version != fps->fps_version) { spin_unlock(&fps->fps_lock); goto again; @@ -2353,32 +2248,25 @@ static int kiblnd_hdev_get_attr(struct kib_hca_dev *hdev) hdev->ibh_page_size = 1 << PAGE_SHIFT; hdev->ibh_page_mask = ~((u64)hdev->ibh_page_size - 1); - if (hdev->ibh_ibdev->ops.alloc_fmr && - hdev->ibh_ibdev->ops.dealloc_fmr && - hdev->ibh_ibdev->ops.map_phys_fmr && - hdev->ibh_ibdev->ops.unmap_fmr) { - LCONSOLE_INFO("Using FMR for registration\n"); - hdev->ibh_dev->ibd_dev_caps |= IBLND_DEV_CAPS_FMR_ENABLED; - } else if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) { + hdev->ibh_mr_size = dev_attr->max_mr_size; + hdev->ibh_max_qp_wr = dev_attr->max_qp_wr; + + if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) { LCONSOLE_INFO("Using FastReg for registration\n"); hdev->ibh_dev->ibd_dev_caps |= IBLND_DEV_CAPS_FASTREG_ENABLED; if (dev_attr->device_cap_flags & IB_DEVICE_SG_GAPS_REG) hdev->ibh_dev->ibd_dev_caps |= IBLND_DEV_CAPS_FASTREG_GAPS_SUPPORT; } else { - CERROR("IB device does not support FMRs nor FastRegs, can't register memory: %d\n", + CERROR("IB device does not support FastRegs, can't register memory: %d\n", -ENXIO); return -ENXIO; } - hdev->ibh_mr_size = dev_attr->max_mr_size; - hdev->ibh_max_qp_wr = dev_attr->max_qp_wr; - rc2 = kiblnd_port_get_attr(hdev); if (rc2 != 0) - return rc2; + CERROR("Invalid mr size: %#llx\n", hdev->ibh_mr_size); - CERROR("Invalid mr size: %#llx\n", hdev->ibh_mr_size); - return -EINVAL; + return rc2; } void kiblnd_hdev_destroy(struct kib_hca_dev *hdev) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.h b/net/lnet/klnds/o2iblnd/o2iblnd.h index 424ca07..12d220c 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.h +++ b/net/lnet/klnds/o2iblnd/o2iblnd.h @@ -60,7 +60,6 @@ #include #include #include -#include #define DEBUG_SUBSYSTEM S_LND @@ -146,7 +145,6 @@ struct kib_tunables { enum kib_dev_caps { IBLND_DEV_CAPS_FASTREG_ENABLED = BIT(0), IBLND_DEV_CAPS_FASTREG_GAPS_SUPPORT = 
BIT(1), - IBLND_DEV_CAPS_FMR_ENABLED = BIT(2), }; struct kib_dev { @@ -281,9 +279,6 @@ struct kib_fmr_pool { struct kib_hca_dev *fpo_hdev; /* device for this pool */ struct kib_fmr_poolset *fpo_owner; /* owner of this pool */ union { - struct { - struct ib_fmr_pool *fpo_fmr_pool; /* IB FMR pool */ - } fmr; struct { /* For fast registration */ struct list_head fpo_pool_list; int fpo_pool_size; @@ -296,7 +291,6 @@ struct kib_fmr_pool { struct kib_fmr { struct kib_fmr_pool *fmr_pool; /* pool of FMR */ - struct ib_pool_fmr *fmr_pfmr; /* IB pool fmr */ struct kib_fast_reg_descriptor *fmr_frd; u32 fmr_key; }; diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c index 5cd367e5..c799453 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -575,23 +575,6 @@ static int kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type, return -EPROTONOSUPPORT; } - /* - * FMR does not support gaps but the tx has gaps then - * we should make sure that the number of fragments we'll be sending - * over fits within the number of fragments negotiated on the - * connection, otherwise, we won't be able to RDMA the data. - * We need to maintain the number of fragments negotiation on the - * connection for backwards compatibility. - */ - if (tx->tx_gaps && (dev->ibd_dev_caps & IBLND_DEV_CAPS_FMR_ENABLED)) { - if (tx->tx_conn && - tx->tx_conn->ibc_max_frags <= rd->rd_nfrags) { - CERROR("TX number of frags (%d) is <= than connection number of frags (%d). 
Consider setting peer's map_on_demand to 256\n", - tx->tx_nfrags, tx->tx_conn->ibc_max_frags); - return -EFBIG; - } - } - fps = net->ibn_fmr_ps[cpt]; rc = kiblnd_fmr_pool_map(fps, tx, rd, nob, 0, &tx->tx_fmr); if (rc) { @@ -606,14 +589,10 @@ static int kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type, */ rd->rd_key = tx->tx_fmr.fmr_key; /* - * for FastReg or FMR with no gaps we can accumulate all + * for FastReg with no gaps we can accumulate all * the fragments in one FastReg or FMR fragment. */ - if (((dev->ibd_dev_caps & IBLND_DEV_CAPS_FMR_ENABLED) && !tx->tx_gaps) || - (dev->ibd_dev_caps & IBLND_DEV_CAPS_FASTREG_ENABLED)) { - /* FMR requires zero based address */ - if (dev->ibd_dev_caps & IBLND_DEV_CAPS_FMR_ENABLED) - rd->rd_frags[0].rf_addr &= ~hdev->ibh_page_mask; + if (dev->ibd_dev_caps & IBLND_DEV_CAPS_FASTREG_ENABLED) { rd->rd_frags[0].rf_nob = nob; rd->rd_nfrags = 1; } else { @@ -633,7 +612,7 @@ static int kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type, static void kiblnd_unmap_tx(struct kib_tx *tx) { - if (tx->tx_fmr.fmr_pfmr || tx->tx_fmr.fmr_frd) + if (tx->tx_fmr.fmr_frd) kiblnd_fmr_pool_unmap(&tx->tx_fmr, tx->tx_status); if (tx->tx_nfrags) { From patchwork Thu Jan 21 17:16:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037217 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59B30C433E0 for ; Thu, 21 Jan 2021 17:19:22 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com 
(pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E7A1223A57 for ; Thu, 21 Jan 2021 17:19:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E7A1223A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 5D9E221FD89; Thu, 21 Jan 2021 09:18:20 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E2DAB21FC18 for ; Thu, 21 Jan 2021 09:17:15 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 8B78D1008494; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8A9F81B49D; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:16:59 -0500 Message-Id: <1611249422-556-37-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 36/39] lustre: llite: return EOPNOTSUPP if fallocate is not supported X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: "John L. 
Hammond" In ll_fallocate(), if the server returns the NFSv3-specific error code ENOTSUPP, replace it with EOPNOTSUPP to avoid confusing applications. WC-bug-id: https://jira.whamcloud.com/browse/LU-14301 Lustre-commit: 71a9f5a466bfa4 ("LU-14301 llite: return EOPNOTSUPP if fallocate is not supported") Signed-off-by: John L. Hammond Reviewed-on: https://review.whamcloud.com/41148 Reviewed-by: Andreas Dilger Reviewed-by: Arshad Hussain Reviewed-by: Wang Shilong Signed-off-by: James Simmons --- fs/lustre/llite/file.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index a3a8d1a..7c7ac01 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -4934,6 +4934,7 @@ int cl_falloc(struct inode *inode, int mode, loff_t offset, loff_t len) long ll_fallocate(struct file *filp, int mode, loff_t offset, loff_t len) { struct inode *inode = filp->f_path.dentry->d_inode; + int rc; /* * Encrypted inodes can't handle collapse range or zero range or insert @@ -4955,7 +4956,17 @@ long ll_fallocate(struct file *filp, int mode, loff_t offset, loff_t len) ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_FALLOCATE, 1); - return cl_falloc(inode, mode, offset, len); + rc = cl_falloc(inode, mode, offset, len); + /* + * ENOTSUPP (524) is an NFSv3 specific error code erroneously + * used by Lustre in several places. Returning it here would + * confuse applications that explicitly test for EOPNOTSUPP + * (95) and fall back to ftruncate().
+ */ + if (rc == -ENOTSUPP) + rc = -EOPNOTSUPP; + + return rc; } static int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, From patchwork Thu Jan 21 17:17:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037203 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9EAE0C433E0 for ; Thu, 21 Jan 2021 17:18:51 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2CFA123A5D for ; Thu, 21 Jan 2021 17:18:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2CFA123A5D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 867C021FF4A; Thu, 21 Jan 2021 09:18:03 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2749921FCA0 for ; Thu, 21 Jan 2021 09:17:16 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 8F2311008495; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov 
(Postfix, from userid 2004) id 8DA421B49E; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:17:00 -0500 Message-Id: <1611249422-556-38-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 37/39] lnet: use an unbound cred in kiblnd_resolve_addr() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: "John L. Hammond" In kiblnd_resolve_addr() call prepare_kernel_cred(NULL) rather than prepare_creds() to get a cred with unbound capabilities. Fixes: 5fc342b471a ("lnet: o2ib: raise bind cap before resolving address") WC-bug-id: https://jira.whamcloud.com/browse/LU-14296 Lustre-commit: 30b356a28b5094 ("LU-14296 lnet: use an unbound cred in kiblnd_resolve_addr()") Signed-off-by: John L. 
Hammond Reviewed-on: https://review.whamcloud.com/41137 Reviewed-by: Amir Shehata Reviewed-by: Andreas Dilger Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c index c799453..e29cb4b 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -1207,8 +1207,6 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, unsigned short port; int rc; - LASSERT(capable(CAP_NET_BIND_SERVICE)); - /* allow the port to be reused */ rc = rdma_set_reuseaddr(cmid, 1); if (rc) { @@ -1234,7 +1232,8 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, } } - CERROR("Failed to bind to a free privileged port\n"); + CERROR("cannot bind to a free privileged port: rc = %d\n", rc); + return rc; } @@ -1249,7 +1248,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, int rc; if (!capable(CAP_NET_BIND_SERVICE)) { - new_creds = prepare_creds(); + new_creds = prepare_kernel_cred(NULL); if (!new_creds) return -ENOMEM; From patchwork Thu Jan 21 17:17:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037183 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0AFC9C433DB for ; Thu, 21 Jan 2021 17:18:15 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A998523A5A for ; Thu, 21 Jan 2021 17:18:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A998523A5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 86FF221FE70; Thu, 21 Jan 2021 09:17:47 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 617C721FCAD for ; Thu, 21 Jan 2021 09:17:16 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 91D151008496; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 909EF1B49B; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:17:01 -0500 Message-Id: <1611249422-556-39-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 38/39] lustre: lov: correctly set OST obj size X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Bobi Jam When extending a PFL file to a size located at a stripe boundary in a component, the truncate does not set the size of the OST object in the prior stripe. This patch records the prior stripe in lov_layout_raid0::lo_trunc_stripeno, adds that stripe to the truncate IO, and enqueues the lock covering it. WC-bug-id: https://jira.whamcloud.com/browse/LU-14128 Lustre-commit: 98015004516cad ("LU-14128 lov: correctly set OST obj size") Signed-off-by: Bobi Jam Reviewed-on: https://review.whamcloud.com/40581 Reviewed-by: Andreas Dilger Reviewed-by: Mike Pershin Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/lov/lov_cl_internal.h | 5 +++ fs/lustre/lov/lov_internal.h | 1 + fs/lustre/lov/lov_io.c | 97 +++++++++++++++++++++++++++++++++++------ fs/lustre/lov/lov_lock.c | 31 +++++++++---- fs/lustre/lov/lov_object.c | 1 + fs/lustre/lov/lov_offset.c | 2 +- 6 files changed, 114 insertions(+), 23 deletions(-) diff --git a/fs/lustre/lov/lov_cl_internal.h b/fs/lustre/lov/lov_cl_internal.h index 7128224..f86176a 100644 --- a/fs/lustre/lov/lov_cl_internal.h +++ b/fs/lustre/lov/lov_cl_internal.h @@ -176,6 +176,11 @@ struct lov_comp_layout_entry_ops { struct lov_layout_raid0 { unsigned int lo_nr; /** + * record the stripe no before the truncate size, used for setting OST + * object size for truncate. LU-14128. + */ + int lo_trunc_stripeno; + /** * When this is true, lov_object::lo_attr contains * valid up to date attributes for a top-level * object.
This field is reset to 0 when attributes of diff --git a/fs/lustre/lov/lov_internal.h b/fs/lustre/lov/lov_internal.h index 202e4b5..5d726fd 100644 --- a/fs/lustre/lov/lov_internal.h +++ b/fs/lustre/lov/lov_internal.h @@ -264,6 +264,7 @@ int lov_merge_lvb_kms(struct lov_stripe_md *lsm, int index, struct ost_lvb *lvb, u64 *kms_place); /* lov_offset.c */ +u64 stripe_width(struct lov_stripe_md *lsm, unsigned int index); u64 lov_stripe_size(struct lov_stripe_md *lsm, int index, u64 ost_size, int stripeno); int lov_stripe_offset(struct lov_stripe_md *lsm, int index, u64 lov_off, diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c index d4a0c9d..daceab0 100644 --- a/fs/lustre/lov/lov_io.c +++ b/fs/lustre/lov/lov_io.c @@ -752,6 +752,24 @@ static u64 lov_offset_mod(u64 val, int delta) return val; } +static int lov_io_add_sub(const struct lu_env *env, struct lov_io *lio, + struct lov_io_sub *sub, u64 start, u64 end) +{ + int rc; + + end = lov_offset_mod(end, 1); + lov_io_sub_inherit(sub, lio, start, end); + rc = cl_io_iter_init(sub->sub_env, &sub->sub_io); + if (rc != 0) { + cl_io_iter_fini(sub->sub_env, &sub->sub_io); + return rc; + } + + list_add_tail(&sub->sub_linkage, &lio->lis_active); + + return rc; +} + static int lov_io_iter_init(const struct lu_env *env, const struct cl_io_slice *ios) { @@ -768,10 +786,13 @@ static int lov_io_iter_init(const struct lu_env *env, lov_foreach_io_layout(index, lio, &ext) { struct lov_layout_entry *le = lov_entry(lio->lis_object, index); struct lov_layout_raid0 *r0 = &le->lle_raid0; + bool tested_trunc_stripe = false; int stripe; u64 start; u64 end; + r0->lo_trunc_stripeno = -1; + CDEBUG(D_VFSTRACE, "component[%d] flags %#x\n", index, lsm->lsm_entries[index]->lsme_flags); if (!lsm_entry_inited(lsm, index)) { @@ -801,28 +822,76 @@ static int lov_io_iter_init(const struct lu_env *env, continue; } - end = lov_offset_mod(end, 1); + if (cl_io_is_trunc(ios->cis_io) && + !tested_trunc_stripe) { + int prev; + u64 tr_start; + + prev = 
(stripe == 0) ? r0->lo_nr - 1 : + stripe - 1; + /** + * Only involving previous stripe if the + * truncate in this component is at the + * beginning of this stripe. + */ + tested_trunc_stripe = true; + if (ext.e_start < + lsm->lsm_entries[index]->lsme_extent.e_start) { + /* need previous stripe involvement */ + r0->lo_trunc_stripeno = prev; + } else { + tr_start = ext.e_start; + tr_start = lov_do_div64(tr_start, + stripe_width(lsm, index)); + /* tr_start %= stripe_swidth */ + if (tr_start == stripe * lsm->lsm_entries[index]->lsme_stripe_size) + r0->lo_trunc_stripeno = prev; + } + } + + /* if the last stripe is the trunc stripeno */ + if (r0->lo_trunc_stripeno == stripe) + r0->lo_trunc_stripeno = -1; + sub = lov_sub_get(env, lio, lov_comp_index(index, stripe)); - if (IS_ERR(sub)) { - rc = PTR_ERR(sub); - break; - } + if (IS_ERR(sub)) + return PTR_ERR(sub); - lov_io_sub_inherit(sub, lio, start, end); - rc = cl_io_iter_init(sub->sub_env, &sub->sub_io); - if (rc) { - cl_io_iter_fini(sub->sub_env, &sub->sub_io); + rc = lov_io_add_sub(env, lio, sub, start, end); + if (rc) break; + } + if (rc != 0) + break; + + if (r0->lo_trunc_stripeno != -1) { + stripe = r0->lo_trunc_stripeno; + if (unlikely(!r0->lo_sub[stripe])) { + r0->lo_trunc_stripeno = -1; + continue; } + sub = lov_sub_get(env, lio, + lov_comp_index(index, stripe)); + if (IS_ERR(sub)) + return PTR_ERR(sub); - CDEBUG(D_VFSTRACE, "shrink: %d [%llu, %llu)\n", - stripe, start, end); + /** + * the prev sub could be used by another truncate, we'd + * skip it. LU-14128 happens when expand truncate + + * read get wrong kms.
+ */ + if (!list_empty(&sub->sub_linkage)) { + r0->lo_trunc_stripeno = -1; + continue; + } - list_add_tail(&sub->sub_linkage, &lio->lis_active); + (void)lov_stripe_intersects(lsm, index, stripe, &ext, + &start, &end); + rc = lov_io_add_sub(env, lio, sub, start, end); + if (rc != 0) + break; } - if (rc) - break; } return rc; } diff --git a/fs/lustre/lov/lov_lock.c b/fs/lustre/lov/lov_lock.c index 7dae13f..c79f728 100644 --- a/fs/lustre/lov/lov_lock.c +++ b/fs/lustre/lov/lov_lock.c @@ -111,6 +111,7 @@ static int lov_sublock_init(const struct lu_env *env, * through already created sub-locks (possibly shared with other top-locks). */ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env, + const struct cl_io *io, const struct cl_object *obj, struct cl_lock *lock) { @@ -135,10 +136,14 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env, struct lov_layout_raid0 *r0 = lov_r0(lov, index); for (i = 0; i < r0->lo_nr; i++) { - if (likely(r0->lo_sub[i]) && /* spare layout */ - lov_stripe_intersects(lov->lo_lsm, index, i, - &ext, &start, &end)) - nr++; + if (likely(r0->lo_sub[i])) { /* spare layout */ + if (lov_stripe_intersects(lov->lo_lsm, index, i, + &ext, &start, &end)) + nr++; + else if (cl_io_is_trunc(io) && + r0->lo_trunc_stripeno == i) + nr++; + } } } /** @@ -160,12 +165,22 @@ static struct lov_lock *lov_lock_sub_init(const struct lu_env *env, for (i = 0; i < r0->lo_nr; ++i) { struct lov_lock_sub *lls = &lovlck->lls_sub[nr]; struct cl_lock_descr *descr = &lls->sub_lock.cll_descr; + bool intersect = false; - if (unlikely(!r0->lo_sub[i]) || - !lov_stripe_intersects(lov->lo_lsm, index, i, - &ext, &start, &end)) + if (unlikely(!r0->lo_sub[i])) continue; + intersect = lov_stripe_intersects(lov->lo_lsm, index, i, + &ext, &start, &end); + if (intersect) + goto init_sublock; + + if (cl_io_is_trunc(io) && i == r0->lo_trunc_stripeno) + goto init_sublock; + + continue; + +init_sublock: LASSERT(!descr->cld_obj); descr->cld_obj = 
lovsub2cl(r0->lo_sub[i]); descr->cld_start = cl_index(descr->cld_obj, start); @@ -308,7 +323,7 @@ int lov_lock_init_composite(const struct lu_env *env, struct cl_object *obj, struct lov_lock *lck; int result = 0; - lck = lov_lock_sub_init(env, obj, lock); + lck = lov_lock_sub_init(env, io, obj, lock); if (!IS_ERR(lck)) cl_lock_slice_add(lock, &lck->lls_cl, obj, &lov_lock_ops); else diff --git a/fs/lustre/lov/lov_object.c b/fs/lustre/lov/lov_object.c index 3fcd342..d9729c8 100644 --- a/fs/lustre/lov/lov_object.c +++ b/fs/lustre/lov/lov_object.c @@ -215,6 +215,7 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev, spin_lock_init(&r0->lo_sub_lock); r0->lo_nr = lse->lsme_stripe_count; + r0->lo_trunc_stripeno = -1; flags = memalloc_nofs_save(); r0->lo_sub = kvmalloc_array(r0->lo_nr, sizeof(r0->lo_sub[0]), diff --git a/fs/lustre/lov/lov_offset.c b/fs/lustre/lov/lov_offset.c index ca763af..2493331 100644 --- a/fs/lustre/lov/lov_offset.c +++ b/fs/lustre/lov/lov_offset.c @@ -37,7 +37,7 @@ #include "lov_internal.h" -static u64 stripe_width(struct lov_stripe_md *lsm, unsigned int index) +u64 stripe_width(struct lov_stripe_md *lsm, unsigned int index) { struct lov_stripe_md_entry *entry = lsm->lsm_entries[index]; From patchwork Thu Jan 21 17:17:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12037207 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.9 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48CEBC433DB for ; Thu, 21 Jan 2021 17:18:58 
+0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D054423A57 for ; Thu, 21 Jan 2021 17:18:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D054423A57 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B533C21FF8C; Thu, 21 Jan 2021 09:18:06 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id AA73921FAED for ; Thu, 21 Jan 2021 09:17:16 -0800 (PST) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 9498D1008497; Thu, 21 Jan 2021 12:17:05 -0500 (EST) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 93D1A1B49C; Thu, 21 Jan 2021 12:17:05 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 21 Jan 2021 12:17:02 -0500 Message-Id: <1611249422-556-40-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611249422-556-1-git-send-email-jsimmons@infradead.org> References: <1611249422-556-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 39/39] lustre: cksum: add lprocfs checksum support in MDC/MDT X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mikhail Pershin , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mikhail Pershin Add missing support for checksum parameters in MDC and MDT. Handle T10-PI parameters in MDT similarly to OFD: move all functionality to target code and unify its usage in both targets. WC-bug-id: https://jira.whamcloud.com/browse/LU-14194 Lustre-commit: 18d61a910bcc76 ("LU-14194 cksum: add lprocfs checksum support in MDC/MDT") Signed-off-by: Mikhail Pershin Reviewed-on: https://review.whamcloud.com/40971 Reviewed-by: Andreas Dilger Reviewed-by: Li Xi Signed-off-by: James Simmons --- fs/lustre/mdc/lproc_mdc.c | 126 ++++++++++++++++++++++++++++++++++++++++++++++ fs/lustre/osc/lproc_osc.c | 10 ++-- 2 files changed, 131 insertions(+), 5 deletions(-) diff --git a/fs/lustre/mdc/lproc_mdc.c b/fs/lustre/mdc/lproc_mdc.c index ce03999..3a2c37a2 100644 --- a/fs/lustre/mdc/lproc_mdc.c +++ b/fs/lustre/mdc/lproc_mdc.c @@ -34,6 +34,7 @@ #include #include +#include #include #include #include @@ -87,6 +88,127 @@ static ssize_t mdc_max_dirty_mb_seq_write(struct file *file, } LDEBUGFS_SEQ_FOPS(mdc_max_dirty_mb); +DECLARE_CKSUM_NAME; + +static int mdc_checksum_type_seq_show(struct seq_file *m, void *v) +{ + struct obd_device *obd = m->private; + int i; + + if (!obd) + return 0; + + for (i = 0; i < ARRAY_SIZE(cksum_name); i++) { + if ((BIT(i) & obd->u.cli.cl_supp_cksum_types) == 0) + continue; + if (obd->u.cli.cl_cksum_type == BIT(i)) + seq_printf(m, "[%s] ", cksum_name[i]); + else + seq_printf(m, "%s ", cksum_name[i]); + } + seq_puts(m, "\n"); + + return 0; +} + +static ssize_t mdc_checksum_type_seq_write(struct file *file, + const char __user *buffer, + size_t count, loff_t *off) +{ + struct seq_file *m = file->private_data; + struct obd_device *obd = m->private; + char kernbuf[10]; + int rc = -EINVAL; + int i; + + if (!obd) + return 0; + + if (count >
sizeof(kernbuf) - 1) + return -EINVAL; + if (copy_from_user(kernbuf, buffer, count)) + return -EFAULT; + + if (count > 0 && kernbuf[count - 1] == '\n') + kernbuf[count - 1] = '\0'; + else + kernbuf[count] = '\0'; + + for (i = 0; i < ARRAY_SIZE(cksum_name); i++) { + if (strcmp(kernbuf, cksum_name[i]) == 0) { + obd->u.cli.cl_preferred_cksum_type = BIT(i); + if (obd->u.cli.cl_supp_cksum_types & BIT(i)) { + obd->u.cli.cl_cksum_type = BIT(i); + rc = count; + } else { + rc = -ENOTSUPP; + } + break; + } + } + + return rc; +} +LDEBUGFS_SEQ_FOPS(mdc_checksum_type); + +static ssize_t checksums_show(struct kobject *kobj, + struct attribute *attr, char *buf) +{ + struct obd_device *obd = container_of(kobj, struct obd_device, + obd_kset.kobj); + + return scnprintf(buf, PAGE_SIZE, "%d\n", !!obd->u.cli.cl_checksum); +} + +static ssize_t checksums_store(struct kobject *kobj, + struct attribute *attr, + const char *buffer, + size_t count) +{ + struct obd_device *obd = container_of(kobj, struct obd_device, + obd_kset.kobj); + bool val; + int rc; + + rc = kstrtobool(buffer, &val); + if (rc) + return rc; + + obd->u.cli.cl_checksum = val; + + return count; +} +LUSTRE_RW_ATTR(checksums); + +static ssize_t checksum_dump_show(struct kobject *kobj, + struct attribute *attr, char *buf) +{ + struct obd_device *obd = container_of(kobj, struct obd_device, + obd_kset.kobj); + + return scnprintf(buf, PAGE_SIZE, "%d\n", !!obd->u.cli.cl_checksum_dump); +} + +static ssize_t checksum_dump_store(struct kobject *kobj, + struct attribute *attr, + const char *buffer, + size_t count) +{ + struct obd_device *obd = container_of(kobj, struct obd_device, + obd_kset.kobj); + bool val; + int rc; + + rc = kstrtobool(buffer, &val); + if (rc) + return rc; + + obd->u.cli.cl_checksum_dump = val; + + return count; +} +LUSTRE_RW_ATTR(checksum_dump); + static int mdc_cached_mb_seq_show(struct seq_file *m, void *v) { struct obd_device *obd = m->private; @@ -503,6 +625,8 @@ static ssize_t 
mdc_dom_min_repsize_seq_write(struct file *file, .fops = &mdc_max_dirty_mb_fops }, { .name = "mdc_cached_mb", .fops = &mdc_cached_mb_fops }, + { .name = "checksum_type", + .fops = &mdc_checksum_type_fops }, { .name = "timeouts", .fops = &mdc_timeouts_fops }, { .name = "contention_seconds", @@ -526,6 +650,8 @@ static ssize_t mdc_dom_min_repsize_seq_write(struct file *file, static struct attribute *mdc_attrs[] = { &lustre_attr_active.attr, + &lustre_attr_checksums.attr, + &lustre_attr_checksum_dump.attr, &lustre_attr_max_rpcs_in_flight.attr, &lustre_attr_max_mod_rpcs_in_flight.attr, &lustre_attr_max_pages_per_rpc.attr, diff --git a/fs/lustre/osc/lproc_osc.c b/fs/lustre/osc/lproc_osc.c index 89b55c3..e64176e 100644 --- a/fs/lustre/osc/lproc_osc.c +++ b/fs/lustre/osc/lproc_osc.c @@ -358,7 +358,7 @@ static ssize_t checksums_show(struct kobject *kobj, struct obd_device *obd = container_of(kobj, struct obd_device, obd_kset.kobj); - return sprintf(buf, "%d\n", obd->u.cli.cl_checksum ? 1 : 0); + return scnprintf(buf, PAGE_SIZE, "%d\n", !!obd->u.cli.cl_checksum); } static ssize_t checksums_store(struct kobject *kobj, @@ -381,10 +381,11 @@ static ssize_t checksums_store(struct kobject *kobj, } LUSTRE_RW_ATTR(checksums); +DECLARE_CKSUM_NAME; + static int osc_checksum_type_seq_show(struct seq_file *m, void *v) { struct obd_device *obd = m->private; - DECLARE_CKSUM_NAME; int i; if (!obd) @@ -408,10 +409,9 @@ static ssize_t osc_checksum_type_seq_write(struct file *file, { struct seq_file *m = file->private_data; struct obd_device *obd = m->private; - DECLARE_CKSUM_NAME; char kernbuf[10]; - int i; int rc = -EINVAL; + int i; if (!obd) return 0; @@ -479,7 +479,7 @@ static ssize_t checksum_dump_show(struct kobject *kobj, struct obd_device *obd = container_of(kobj, struct obd_device, obd_kset.kobj); - return sprintf(buf, "%d\n", obd->u.cli.cl_checksum_dump ? 
1 : 0); + return scnprintf(buf, PAGE_SIZE, "%d\n", !!obd->u.cli.cl_checksum_dump); } static ssize_t checksum_dump_store(struct kobject *kobj,