From patchwork Tue Nov 26 12:24:22 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 11262091
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ED4461393
	for ; Tue, 26 Nov 2019 12:24:53 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id CE1B62068E
	for ; Tue, 26 Nov 2019 12:24:53 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="F2BkRN5T"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1728373AbfKZMYx (ORCPT );
	Tue, 26 Nov 2019 07:24:53 -0500
Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:38559
	"EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S1728361AbfKZMYw (ORCPT );
	Tue, 26 Nov 2019 07:24:52 -0500
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1574771091;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=95xHkjFfntMGoj7eo2xSGgnX9aGcbuQcnLi3CaS/buI=;
	b=F2BkRN5TsIgyLV9PRr30xLjZlFsRGV7Mqu3nM46Vu1mgvxoTkRnUTDWG7HOzoeIqYL1r/G
	 hseNKFQYcbD9jHlx+vtUofJy6RILC6guhN2/hAxJqiGPObdVdxThpk6Fkkf2bk5UdkJQpC
	 K3JpTFTtd8+zd+CIGMAd10bHQes9gV4=
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4])
	(Using TLS) by relay.mimecast.com with ESMTP
	id us-mta-359-i-iLk7HAOfGXzveMwgqrbg-1; Tue, 26 Nov 2019 07:24:49 -0500
Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate
	requested)
	by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6F40E477;
	Tue, 26 Nov 2019 12:24:48 +0000 (UTC)
Received: from localhost.localdomain (ovpn-12-66.pek2.redhat.com [10.72.12.66])
	by smtp.corp.redhat.com (Postfix) with ESMTP id B60325D6C3;
	Tue, 26 Nov 2019 12:24:45 +0000 (UTC)
From: xiubli@redhat.com
To: jlayton@kernel.org, zyan@redhat.com
Cc: sage@redhat.com, idryomov@gmail.com, pdonnell@redhat.com,
	ceph-devel@vger.kernel.org, Xiubo Li
Subject: [PATCH v3 3/3] mdsmap: only choose one MDS who is in up:active state
 without laggy
Date: Tue, 26 Nov 2019 07:24:22 -0500
Message-Id: <20191126122422.12396-4-xiubli@redhat.com>
In-Reply-To: <20191126122422.12396-1-xiubli@redhat.com>
References: <20191126122422.12396-1-xiubli@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
X-MC-Unique: i-iLk7HAOfGXzveMwgqrbg-1
X-Mimecast-Spam-Score: 0
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

From: Xiubo Li

Even when an MDS is in the up:active state, it may still be laggy.
Skip laggy MDSs when choosing one.

Signed-off-by: Xiubo Li
---
 fs/ceph/mds_client.c | 13 +++++++++----
 fs/ceph/mdsmap.c     | 30 +++++++++++++++++++++++-------
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 0444288fe87e..2c92a1452876 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -972,14 +972,14 @@ static int __choose_mds(struct ceph_mds_client *mdsc,
 				     frag.frag, mds,
 				     (int)r, frag.ndist);
 				if (ceph_mdsmap_get_state(mdsc->mdsmap, mds) >=
-				    CEPH_MDS_STATE_ACTIVE)
+				    CEPH_MDS_STATE_ACTIVE &&
+				    !ceph_mdsmap_is_laggy(mdsc->mdsmap, mds))
 					goto out;
 			}
 
 			/* since this file/dir wasn't known to be
 			 * replicated, then we want to look for the
 			 * authoritative mds. */
-			mode = USE_AUTH_MDS;
 			if (frag.mds >= 0) {
 				/* choose auth mds */
 				mds = frag.mds;
@@ -987,9 +987,14 @@ static int __choose_mds(struct ceph_mds_client *mdsc,
 				     "frag %u mds%d (auth)\n",
 				     inode, ceph_vinop(inode), frag.frag, mds);
 				if (ceph_mdsmap_get_state(mdsc->mdsmap, mds) >=
-				    CEPH_MDS_STATE_ACTIVE)
-					goto out;
+				    CEPH_MDS_STATE_ACTIVE) {
+					if (mode == USE_ANY_MDS &&
+					    !ceph_mdsmap_is_laggy(mdsc->mdsmap,
+								  mds))
+						goto out;
+				}
 			}
+			mode = USE_AUTH_MDS;
 		}
 	}
 
diff --git a/fs/ceph/mdsmap.c b/fs/ceph/mdsmap.c
index 3418cf2c6a12..284d68646c40 100644
--- a/fs/ceph/mdsmap.c
+++ b/fs/ceph/mdsmap.c
@@ -13,22 +13,24 @@
 
 #include "super.h"
 
+#define CEPH_MDS_IS_READY(i, ignore_laggy) \
+	(m->m_info[i].state > 0 && (ignore_laggy ? true : !m->m_info[i].laggy))
 
-/*
- * choose a random mds that is "up" (i.e. has a state > 0), or -1.
- */
-int ceph_mdsmap_get_random_mds(struct ceph_mdsmap *m)
+static int __mdsmap_get_random_mds(struct ceph_mdsmap *m, bool ignore_laggy)
 {
 	int n = 0;
 	int i, j;
 
-	/* special case for one mds */
+	/*
+	 * special case for one mds, no matter it is laggy or
+	 * not we have no choice
+	 */
 	if (1 == m->m_num_mds && m->m_info[0].state > 0)
 		return 0;
 
 	/* count */
 	for (i = 0; i < m->m_num_mds; i++)
-		if (m->m_info[i].state > 0)
+		if (CEPH_MDS_IS_READY(i, ignore_laggy))
 			n++;
 	if (n == 0)
 		return -1;
@@ -36,7 +38,7 @@ int ceph_mdsmap_get_random_mds(struct ceph_mdsmap *m)
 	/* pick */
 	n = prandom_u32() % n;
 	for (j = 0, i = 0; i < m->m_num_mds; i++) {
-		if (m->m_info[i].state > 0)
+		if (CEPH_MDS_IS_READY(i, ignore_laggy))
 			j++;
 		if (j > n)
 			break;
@@ -45,6 +47,20 @@ int ceph_mdsmap_get_random_mds(struct ceph_mdsmap *m)
 	return i;
 }
 
+/*
+ * choose a random mds that is "up" (i.e. has a state > 0), or -1.
+ */
+int ceph_mdsmap_get_random_mds(struct ceph_mdsmap *m)
+{
+	int mds;
+
+	mds = __mdsmap_get_random_mds(m, false);
+	if (mds == m->m_num_mds || mds == -1)
+		mds = __mdsmap_get_random_mds(m, true);
+
+	return mds == m->m_num_mds ? -1 : mds;
+}
+
 #define __decode_and_drop_type(p, end, type, bad)		\
 	do {							\
 		if (*p + sizeof(type) > end)			\