From patchwork Mon Sep 30 18:54:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11167111 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9537A13BD for ; Mon, 30 Sep 2019 18:59:09 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7D4FE224D5 for ; Mon, 30 Sep 2019 18:59:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7D4FE224D5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 62B785C3E91; Mon, 30 Sep 2019 11:58:09 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 7310B5C3546 for ; Mon, 30 Sep 2019 11:57:02 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 4B922100538D; Mon, 30 Sep 2019 14:56:56 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 490ABB4; Mon, 30 Sep 2019 14:56:56 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 30 Sep 2019 14:54:32 -0400 Message-Id: <1569869810-23848-14-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1569869810-23848-1-git-send-email-jsimmons@infradead.org> References: <1569869810-23848-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 013/151] lustre: fld: fld client lookup should retry X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Di , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Di If FLD client lookup fails because of the remote target is shutdown (or deactive), it should retry another target, otherwise it will cause the application failure. And FLD client should stop retry if the import has been deactive. WC-bug-id: https://jira.whamcloud.com/browse/LU-6419 Lustre-commit: 3ededde903c ("LU-6419 fld: fld client lookup should retry") Signed-off-by: wang di Reviewed-on: http://review.whamcloud.com/14313 Reviewed-by: Lai Siyao Reviewed-by: Fan Yong Reviewed-by: Alex Zhuravlev Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/fld/fld_request.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/fs/lustre/fld/fld_request.c b/fs/lustre/fld/fld_request.c index 062f19f..75cba18 100644 --- a/fs/lustre/fld/fld_request.c +++ b/fs/lustre/fld/fld_request.c @@ -367,7 +367,7 @@ int fld_client_rpc(struct obd_export *exp, rc = ptlrpc_queue_wait(req); obd_put_request_slot(&exp->exp_obd->u.cli); if (rc != 0) { - if (imp->imp_state != LUSTRE_IMP_CLOSED) { + if (imp->imp_state != LUSTRE_IMP_CLOSED && !imp->imp_deactive) { /* Since LWP is not replayable, so it will keep * trying unless umount happens, otherwise it would * cause unnecessary failure of the application. @@ -404,6 +404,7 @@ int fld_client_lookup(struct lu_client_fld *fld, u64 seq, u32 *mds, { struct lu_seq_range res = { 0 }; struct lu_fld_target *target; + struct lu_fld_target *origin; int rc; rc = fld_cache_lookup(fld->lcf_cache, seq, &res); @@ -415,7 +416,8 @@ int fld_client_lookup(struct lu_client_fld *fld, u64 seq, u32 *mds, /* Can not find it in the cache */ target = fld_client_get_target(fld, seq); LASSERT(target); - + origin = target; +again: CDEBUG(D_INFO, "%s: Lookup fld entry (seq: %#llx) on target %s (idx %llu)\n", fld->lcf_name, seq, fld_target_name(target), target->ft_idx); @@ -424,6 +426,23 @@ int fld_client_lookup(struct lu_client_fld *fld, u64 seq, u32 *mds, fld_range_set_type(&res, flags); rc = fld_client_rpc(target->ft_exp, &res, FLD_QUERY, NULL); + if (rc == -ESHUTDOWN) { + /* If fld lookup failed because the target has been shutdown, + * then try next target in the list, until trying all targets + * or fld lookup succeeds + */ + spin_lock(&fld->lcf_lock); + if (target->ft_chain.next == fld->lcf_targets.prev) + target = list_entry(fld->lcf_targets.next, + struct lu_fld_target, ft_chain); + else + target = list_entry(target->ft_chain.next, + struct lu_fld_target, + ft_chain); + spin_unlock(&fld->lcf_lock); + if (target != origin) + goto again; + } if (rc == 0) { *mds = res.lsr_index;