From patchwork Sun Jun 28 08:42:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11629963 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 249A3174A for ; Sun, 28 Jun 2020 08:42:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0C4FC20857 for ; Sun, 28 Jun 2020 08:42:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="cusBPgC0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726167AbgF1Imf (ORCPT ); Sun, 28 Jun 2020 04:42:35 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:46229 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726156AbgF1Imf (ORCPT ); Sun, 28 Jun 2020 04:42:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1593333754; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:in-reply-to:in-reply-to:references:references; bh=0m8kZ4gRdQmZ7XYTCzz5brAZXZHRJSvphFqKqSIPNOY=; b=cusBPgC0vBC4KKujnlc4j3mmtJ4ga+3H7Mce4LV4ILlmXX5vrSX2GGS2A+isl7tQs0NMKo u/EcVp6faIJE4Si3ughBYw8RaAqxqnKTkocR6yniuAJoO5dtJWDD03nXgvQW2Y4CYiaMMk xAxlt+KKQTotxvs2xEq0okuXpvW8Zbs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-10-N0jXpK8SPF6H9POrhXXPxQ-1; Sun, 28 Jun 2020 04:42:24 -0400 X-MC-Unique: N0jXpK8SPF6H9POrhXXPxQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C432C1005512; Sun, 28 Jun 2020 08:42:23 +0000 (UTC) Received: from lxbceph0.gsslab.pek2.redhat.com (vm37-55.gsslab.pek2.redhat.com [10.72.37.55]) by smtp.corp.redhat.com (Postfix) with ESMTP id E9D9B1944D; Sun, 28 Jun 2020 08:42:21 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com Cc: zyan@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v4 1/5] ceph: add check_session_state helper and make it global Date: Sun, 28 Jun 2020 04:42:10 -0400 Message-Id: <1593333734-27480-2-git-send-email-xiubli@redhat.com> In-Reply-To: <1593333734-27480-1-git-send-email-xiubli@redhat.com> References: <1593333734-27480-1-git-send-email-xiubli@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li This will be used by followed sending metrics patches. URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/mds_client.c | 43 ++++++++++++++++++++++++++----------------- fs/ceph/mds_client.h | 4 ++++ 2 files changed, 30 insertions(+), 17 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index a504971..608fb5c 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4263,6 +4263,30 @@ static void maybe_recover_session(struct ceph_mds_client *mdsc) ceph_force_reconnect(fsc->sb); } +bool check_session_state(struct ceph_mds_client *mdsc, + struct ceph_mds_session *s) +{ + if (s->s_state == CEPH_MDS_SESSION_CLOSING) { + dout("resending session close request for mds%d\n", + s->s_mds); + request_close_session(mdsc, s); + return false; + } + if (s->s_ttl && time_after(jiffies, s->s_ttl)) { + if (s->s_state == CEPH_MDS_SESSION_OPEN) { + s->s_state = CEPH_MDS_SESSION_HUNG; + pr_info("mds%d hung\n", s->s_mds); + } + } + if (s->s_state == CEPH_MDS_SESSION_NEW || + s->s_state == CEPH_MDS_SESSION_RESTARTING || + s->s_state == CEPH_MDS_SESSION_REJECTED) + /* this mds is failed or recovering, just wait */ + return false; + + return true; +} + /* * delayed work -- periodically trim expired leases, renew caps with mds */ @@ -4294,23 +4318,8 @@ static void delayed_work(struct work_struct *work) struct ceph_mds_session *s = __ceph_lookup_mds_session(mdsc, i); if (!s) continue; - if (s->s_state == CEPH_MDS_SESSION_CLOSING) { - dout("resending session close request for mds%d\n", - s->s_mds); - request_close_session(mdsc, s); - ceph_put_mds_session(s); - continue; - } - if (s->s_ttl && time_after(jiffies, s->s_ttl)) { - if (s->s_state == CEPH_MDS_SESSION_OPEN) { - s->s_state = CEPH_MDS_SESSION_HUNG; - pr_info("mds%d hung\n", s->s_mds); - } - } - if (s->s_state == CEPH_MDS_SESSION_NEW || - s->s_state == CEPH_MDS_SESSION_RESTARTING || - s->s_state == CEPH_MDS_SESSION_REJECTED) { - /* this mds is failed or recovering, just wait */ + + if (!check_session_state(mdsc, s)) { ceph_put_mds_session(s); continue; } diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 5e0c407..bcb3892 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -18,6 +18,7 @@ #include #include "metric.h" +#include "super.h" /* The first 8 bits are reserved for old ceph releases */ enum ceph_feature_type { @@ -476,6 +477,9 @@ struct ceph_mds_client { extern const char *ceph_mds_op_name(int op); +extern bool check_session_state(struct ceph_mds_client *mdsc, + struct ceph_mds_session *s); + extern struct ceph_mds_session * __ceph_lookup_mds_session(struct ceph_mds_client *, int mds); From patchwork Sun Jun 28 08:42:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11629961 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E6F2613B4 for ; Sun, 28 Jun 2020 08:42:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CD37020857 for ; Sun, 28 Jun 2020 08:42:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="KoWyPthA" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726139AbgF1Ime (ORCPT ); Sun, 28 Jun 2020 04:42:34 -0400 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:43546 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726105AbgF1Ime (ORCPT ); Sun, 28 Jun 2020 04:42:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1593333752; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:in-reply-to:in-reply-to:references:references; bh=XJvCYta9kvuo2nFld2eAMUgvnv4oS9CcRBzrSGcWMuo=; b=KoWyPthAP4scKC3dUC0+RBe2Hjb0e4mHFQJWiQsi/WBfNilY3LcCbVrDvs+qeC6MpNjYfl wMajLBi/H4p57PJw4YRSIoYDk9tjZps/EmP/ZfIyyCTop3pZ/8n6J0BN+6i/shrKG8D9Ve jFkZshadHghEetlslGKVmRiZnFq++VU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-277-Ihpq9HcUNKijl8g3ppI9mQ-1; Sun, 28 Jun 2020 04:42:27 -0400 X-MC-Unique: Ihpq9HcUNKijl8g3ppI9mQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6338480183C; Sun, 28 Jun 2020 08:42:26 +0000 (UTC) Received: from lxbceph0.gsslab.pek2.redhat.com (vm37-55.gsslab.pek2.redhat.com [10.72.37.55]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4AA141944D; Sun, 28 Jun 2020 08:42:24 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com Cc: zyan@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v4 2/5] ceph: add global total_caps to count the mdsc's total caps number Date: Sun, 28 Jun 2020 04:42:11 -0400 Message-Id: <1593333734-27480-3-git-send-email-xiubli@redhat.com> In-Reply-To: <1593333734-27480-1-git-send-email-xiubli@redhat.com> References: <1593333734-27480-1-git-send-email-xiubli@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li This will help to reduce using the global mdsc->metux lock in many places. URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/caps.c | 2 ++ fs/ceph/debugfs.c | 14 ++------------ fs/ceph/mds_client.c | 1 + fs/ceph/metric.c | 1 + fs/ceph/metric.h | 1 + 5 files changed, 7 insertions(+), 12 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 972c13a..5f48940 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -668,6 +668,7 @@ void ceph_add_cap(struct inode *inode, spin_lock(&session->s_cap_lock); list_add_tail(&cap->session_caps, &session->s_caps); session->s_nr_caps++; + atomic64_inc(&mdsc->metric.total_caps); spin_unlock(&session->s_cap_lock); } else { spin_lock(&session->s_cap_lock); @@ -1161,6 +1162,7 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release) } else { list_del_init(&cap->session_caps); session->s_nr_caps--; + atomic64_dec(&mdsc->metric.total_caps); cap->session = NULL; removed = 1; } diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c index 070ed84..3030f55 100644 --- a/fs/ceph/debugfs.c +++ b/fs/ceph/debugfs.c @@ -145,7 +145,7 @@ static int metric_show(struct seq_file *s, void *p) struct ceph_fs_client *fsc = s->private; struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_client_metric *m = &mdsc->metric; - int i, nr_caps = 0; + int nr_caps = 0; s64 total, sum, avg, min, max, sq; seq_printf(s, "item total avg_lat(us) min_lat(us) max_lat(us) stdev(us)\n"); @@ -190,17 +190,7 @@ static int metric_show(struct seq_file *s, void *p) percpu_counter_sum(&m->d_lease_mis), percpu_counter_sum(&m->d_lease_hit)); - mutex_lock(&mdsc->mutex); - for (i = 0; i < mdsc->max_sessions; i++) { - struct ceph_mds_session *s; - - s = __ceph_lookup_mds_session(mdsc, i); - if (!s) - continue; - nr_caps += s->s_nr_caps; - ceph_put_mds_session(s); - } - mutex_unlock(&mdsc->mutex); + nr_caps = atomic64_read(&m->total_caps); seq_printf(s, "%-14s%-16d%-16lld%lld\n", "caps", nr_caps, percpu_counter_sum(&m->i_caps_mis), percpu_counter_sum(&m->i_caps_hit)); diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 608fb5c..2eeab10 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1485,6 +1485,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session, cap->session = NULL; list_del_init(&cap->session_caps); session->s_nr_caps--; + atomic64_dec(&session->s_mdsc->metric.total_caps); if (cap->queue_release) __ceph_queue_cap_release(session, cap); else diff --git a/fs/ceph/metric.c b/fs/ceph/metric.c index 9217f35..269eacb 100644 --- a/fs/ceph/metric.c +++ b/fs/ceph/metric.c @@ -22,6 +22,7 @@ int ceph_metric_init(struct ceph_client_metric *m) if (ret) goto err_d_lease_mis; + atomic64_set(&m->total_caps, 0); ret = percpu_counter_init(&m->i_caps_hit, 0, GFP_KERNEL); if (ret) goto err_i_caps_hit; diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h index ccd8128..23a3373 100644 --- a/fs/ceph/metric.h +++ b/fs/ceph/metric.h @@ -12,6 +12,7 @@ struct ceph_client_metric { struct percpu_counter d_lease_hit; struct percpu_counter d_lease_mis; + atomic64_t total_caps; struct percpu_counter i_caps_hit; struct percpu_counter i_caps_mis; From patchwork Sun Jun 28 08:42:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11629965 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0176A138C for ; Sun, 28 Jun 2020 08:42:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D2129206D7 for ; Sun, 28 Jun 2020 08:42:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="gVtFcMUs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726055AbgF1Img (ORCPT ); Sun, 28 Jun 2020 04:42:36 -0400 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:38636 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726105AbgF1Img (ORCPT ); Sun, 28 Jun 2020 04:42:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1593333754; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:in-reply-to:in-reply-to:references:references; bh=lN3Z8+PYz3ibbgnOa0vbBYKkClaCPVwVP/s9CCIBiBU=; b=gVtFcMUsxyR1+/Xe16aBDxZKIm/uuv7GCZuHGtCCymz6zM3OEtB5iB5h94Ek7YOjsun8vO lK05lpQWfXH5q8Sd4PiFfBJCjk789E+SBKwRLpkiwZusCWN6mmUV0arMVNCt5x2eR2ae6Y u27JmRxhTmLqMniJbLSEJB8WYX082gg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-118-aISY8DbDPDaJtNj-tgSo0Q-1; Sun, 28 Jun 2020 04:42:30 -0400 X-MC-Unique: aISY8DbDPDaJtNj-tgSo0Q-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 031AE8015CE; Sun, 28 Jun 2020 08:42:29 +0000 (UTC) Received: from lxbceph0.gsslab.pek2.redhat.com (vm37-55.gsslab.pek2.redhat.com [10.72.37.55]) by smtp.corp.redhat.com (Postfix) with ESMTP id DE1E096B62; Sun, 28 Jun 2020 08:42:26 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com Cc: zyan@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v4 3/5] ceph: periodically send perf metrics to ceph Date: Sun, 28 Jun 2020 04:42:12 -0400 Message-Id: <1593333734-27480-4-git-send-email-xiubli@redhat.com> In-Reply-To: <1593333734-27480-1-git-send-email-xiubli@redhat.com> References: <1593333734-27480-1-git-send-email-xiubli@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li This will send the caps/read/write/metadata metrics to any available MDS only once per second as default, which will be the same as the userland client. It will skip the MDS sessions which don't support the metric collection, or the MDSs will close the socket connections directly when it get an unknown type message. We can disable the metric sending via the enable_send_metric module parameter. URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li --- fs/ceph/mds_client.c | 29 +++++++++ fs/ceph/mds_client.h | 6 +- fs/ceph/metric.c | 142 +++++++++++++++++++++++++++++++++++++++++++ fs/ceph/metric.h | 75 +++++++++++++++++++++++ fs/ceph/super.c | 44 ++++++++++++++ fs/ceph/super.h | 2 + include/linux/ceph/ceph_fs.h | 1 + 7 files changed, 298 insertions(+), 1 deletion(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 2eeab10..18f43a4 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -754,6 +754,7 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, s->s_cap_iterator = NULL; INIT_LIST_HEAD(&s->s_cap_releases); INIT_WORK(&s->s_cap_release_work, ceph_cap_release_work); + INIT_DELAYED_WORK(&s->metric_delayed_work, ceph_metric_delayed_work); INIT_LIST_HEAD(&s->s_cap_dirty); INIT_LIST_HEAD(&s->s_cap_flushing); @@ -1801,6 +1802,20 @@ static int request_close_session(struct ceph_mds_client *mdsc, return 1; } +static void try_reset_metric_work(struct ceph_mds_client *mdsc, + struct ceph_mds_session *session) +{ + mutex_lock(&mdsc->mutex); + if (mdsc->metric.mds == session->s_mds) { + mdsc->metric.mds = -1; + cancel_delayed_work_sync(&session->metric_delayed_work); + + /* Choose a new session to run the metric work */ + ceph_choose_new_metric_session(mdsc); + } + mutex_unlock(&mdsc->mutex); +} + /* * Called with s_mutex held. */ @@ -1810,6 +1825,9 @@ static int __close_session(struct ceph_mds_client *mdsc, if (session->s_state >= CEPH_MDS_SESSION_CLOSING) return 0; session->s_state = CEPH_MDS_SESSION_CLOSING; + + try_reset_metric_work(mdsc, session); + return request_close_session(mdsc, session); } @@ -3312,6 +3330,15 @@ static void handle_session(struct ceph_mds_session *session, session->s_features = features; renewed_caps(mdsc, session, 0); wake = 1; + + mutex_lock(&mdsc->mutex); + if (mdsc->metric.mds < 0 && + test_bit(CEPHFS_FEATURE_METRIC_COLLECT, &session->s_features)) { + ceph_metric_schedule_delayed(session); + mdsc->metric.mds = session->s_mds; + } + mutex_unlock(&mdsc->mutex); + if (mdsc->stopping) __close_session(mdsc, session); break; @@ -3806,6 +3833,8 @@ static void send_mds_reconnect(struct ceph_mds_client *mdsc, xa_destroy(&session->s_delegated_inos); + try_reset_metric_work(mdsc, session); + mutex_lock(&session->s_mutex); session->s_state = CEPH_MDS_SESSION_RECONNECTING; session->s_seq = 0; diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index bcb3892..daa905f 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -28,8 +28,9 @@ enum ceph_feature_type { CEPHFS_FEATURE_LAZY_CAP_WANTED, CEPHFS_FEATURE_MULTI_RECONNECT, CEPHFS_FEATURE_DELEG_INO, + CEPHFS_FEATURE_METRIC_COLLECT, - CEPHFS_FEATURE_MAX = CEPHFS_FEATURE_DELEG_INO, + CEPHFS_FEATURE_MAX = CEPHFS_FEATURE_METRIC_COLLECT, }; /* @@ -43,6 +44,7 @@ enum ceph_feature_type { CEPHFS_FEATURE_LAZY_CAP_WANTED, \ CEPHFS_FEATURE_MULTI_RECONNECT, \ CEPHFS_FEATURE_DELEG_INO, \ + CEPHFS_FEATURE_METRIC_COLLECT, \ \ CEPHFS_FEATURE_MAX, \ } @@ -212,6 +214,8 @@ struct ceph_mds_session { struct list_head s_waiting; /* waiting requests */ struct list_head s_unsafe; /* unsafe requests */ struct xarray s_delegated_inos; + + struct delayed_work metric_delayed_work; /* delayed work */ }; /* diff --git a/fs/ceph/metric.c b/fs/ceph/metric.c index 269eacb..47eba86 100644 --- a/fs/ceph/metric.c +++ b/fs/ceph/metric.c @@ -1,10 +1,150 @@ /* SPDX-License-Identifier: GPL-2.0 */ +#include #include #include #include #include "metric.h" +#include "mds_client.h" + +void ceph_metric_delayed_work(struct work_struct *work) +{ + struct ceph_mds_session *s = + container_of(work, struct ceph_mds_session, metric_delayed_work.work); + struct ceph_mds_client *mdsc = s->s_mdsc; + u64 nr_caps = atomic64_read(&mdsc->metric.total_caps); + struct ceph_metric_head *head; + struct ceph_metric_cap *cap; + struct ceph_metric_read_latency *read; + struct ceph_metric_write_latency *write; + struct ceph_metric_metadata_latency *meta; + struct ceph_client_metric *m = &mdsc->metric; + struct ceph_msg *msg; + struct timespec64 ts; + s64 sum, total; + s32 items = 0; + s32 len; + + len = sizeof(*head) + sizeof(*cap) + sizeof(*read) + sizeof(*write) + + sizeof(*meta); + + msg = ceph_msg_new(CEPH_MSG_CLIENT_METRICS, len, GFP_NOFS, true); + if (!msg) { + pr_err("send metrics to mds%d, failed to allocate message\n", + s->s_mds); + return; + } + + head = msg->front.iov_base; + + /* encode the cap metric */ + cap = (struct ceph_metric_cap *)(head + 1); + cap->type = cpu_to_le32(CLIENT_METRIC_TYPE_CAP_INFO); + cap->ver = 1; + cap->compat = 1; + cap->data_len = cpu_to_le32(sizeof(*cap) - 10); + cap->hit = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_hit)); + cap->mis = cpu_to_le64(percpu_counter_sum(&mdsc->metric.i_caps_mis)); + cap->total = cpu_to_le64(nr_caps); + items++; + + /* encode the read latency metric */ + read = (struct ceph_metric_read_latency *)(cap + 1); + read->type = cpu_to_le32(CLIENT_METRIC_TYPE_READ_LATENCY); + read->ver = 1; + read->compat = 1; + read->data_len = cpu_to_le32(sizeof(*read) - 10); + total = m->total_reads; + sum = m->read_latency_sum; + jiffies_to_timespec64(sum, &ts); + read->sec = cpu_to_le32(ts.tv_sec); + read->nsec = cpu_to_le32(ts.tv_nsec); + items++; + + /* encode the write latency metric */ + write = (struct ceph_metric_write_latency *)(read + 1); + write->type = cpu_to_le32(CLIENT_METRIC_TYPE_WRITE_LATENCY); + write->ver = 1; + write->compat = 1; + write->data_len = cpu_to_le32(sizeof(*write) - 10); + total = m->total_writes; + sum = m->write_latency_sum; + jiffies_to_timespec64(sum, &ts); + write->sec = cpu_to_le32(ts.tv_sec); + write->nsec = cpu_to_le32(ts.tv_nsec); + items++; + + /* encode the metadata latency metric */ + meta = (struct ceph_metric_metadata_latency *)(write + 1); + meta->type = cpu_to_le32(CLIENT_METRIC_TYPE_METADATA_LATENCY); + meta->ver = 1; + meta->compat = 1; + meta->data_len = cpu_to_le32(sizeof(*meta) - 10); + total = m->total_metadatas; + sum = m->metadata_latency_sum; + jiffies_to_timespec64(sum, &ts); + meta->sec = cpu_to_le32(ts.tv_sec); + meta->nsec = cpu_to_le32(ts.tv_nsec); + items++; + + put_unaligned_le32(items, &head->num); + msg->front.iov_len = cpu_to_le32(len); + msg->hdr.version = cpu_to_le16(1); + msg->hdr.compat_version = cpu_to_le16(1); + msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); + dout("client%llu send metrics to mds%d\n", + ceph_client_gid(mdsc->fsc->client), s->s_mds); + ceph_con_send(&s->s_con, msg); + + ceph_metric_schedule_delayed(s); +} + +/* + * called under mdsc->mutex + */ +void ceph_choose_new_metric_session(struct ceph_mds_client *mdsc) +{ + struct ceph_mds_session *s; + int i; + + if (mdsc->stopping) + return; + + for (i = 0; i < mdsc->max_sessions; i++) { + s = __ceph_lookup_mds_session(mdsc, i); + if (!s) + continue; + + if (!check_session_state(mdsc, s)) { + ceph_put_mds_session(s); + continue; + } + + /* + * Skip it if MDS doesn't support the metric collection, + * or the MDS will close the session's socket connection + * directly when it get this message. + */ + if (test_bit(CEPHFS_FEATURE_METRIC_COLLECT, &s->s_features)) { + mdsc->metric.mds = i; + ceph_metric_schedule_delayed(s); + ceph_put_mds_session(s); + return; + } + + ceph_put_mds_session(s); + } +} + +void ceph_metric_schedule_delayed(struct ceph_mds_session *s) +{ + if (!enable_send_metrics) + return; + + /* per second */ + schedule_delayed_work(&s->metric_delayed_work, round_jiffies_relative(HZ)); +} int ceph_metric_init(struct ceph_client_metric *m) { @@ -52,6 +192,8 @@ int ceph_metric_init(struct ceph_client_metric *m) m->total_metadatas = 0; m->metadata_latency_sum = 0; + m->mds = -1; + return 0; err_i_caps_mis: diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h index 23a3373..1a7e3fb 100644 --- a/fs/ceph/metric.h +++ b/fs/ceph/metric.h @@ -6,6 +6,74 @@ #include #include +struct ceph_mds_client; +struct ceph_mds_session; + +extern bool enable_send_metrics; + +enum ceph_metric_type { + CLIENT_METRIC_TYPE_CAP_INFO, + CLIENT_METRIC_TYPE_READ_LATENCY, + CLIENT_METRIC_TYPE_WRITE_LATENCY, + CLIENT_METRIC_TYPE_METADATA_LATENCY, + CLIENT_METRIC_TYPE_DENTRY_LEASE, + + CLIENT_METRIC_TYPE_MAX = CLIENT_METRIC_TYPE_DENTRY_LEASE, +}; + +/* metric caps header */ +struct ceph_metric_cap { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 compat; + + __le32 data_len; /* length of sizeof(hit + mis + total) */ + __le64 hit; + __le64 mis; + __le64 total; +} __packed; + +/* metric read latency header */ +struct ceph_metric_read_latency { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 compat; + + __le32 data_len; /* length of sizeof(sec + nsec) */ + __le32 sec; + __le32 nsec; +} __packed; + +/* metric write latency header */ +struct ceph_metric_write_latency { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 compat; + + __le32 data_len; /* length of sizeof(sec + nsec) */ + __le32 sec; + __le32 nsec; +} __packed; + +/* metric metadata latency header */ +struct ceph_metric_metadata_latency { + __le32 type; /* ceph metric type */ + + __u8 ver; + __u8 compat; + + __le32 data_len; /* length of sizeof(sec + nsec) */ + __le32 sec; + __le32 nsec; +} __packed; + +struct ceph_metric_head { + __le32 num; /* the number of metrics that will be sent */ +} __packed; + /* This is the global metrics */ struct ceph_client_metric { atomic64_t total_dentries; @@ -36,8 +104,15 @@ struct ceph_client_metric { ktime_t metadata_latency_sq_sum; ktime_t metadata_latency_min; ktime_t metadata_latency_max; + + /* The MDS session running the metric work */ + int mds; }; +extern void ceph_metric_delayed_work(struct work_struct *work); +extern void ceph_metric_schedule_delayed(struct ceph_mds_session *s); +extern void ceph_choose_new_metric_session(struct ceph_mds_client *mdsc); + extern int ceph_metric_init(struct ceph_client_metric *m); extern void ceph_metric_destroy(struct ceph_client_metric *m); diff --git a/fs/ceph/super.c b/fs/ceph/super.c index c9784eb1..510ccb1 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -27,6 +27,9 @@ #include #include +static DEFINE_MUTEX(ceph_fsc_lock); +static LIST_HEAD(ceph_fsc_list); + /* * Ceph superblock operations * @@ -691,6 +694,10 @@ static struct ceph_fs_client *create_fs_client(struct ceph_mount_options *fsopt, if (!fsc->wb_pagevec_pool) goto fail_cap_wq; + mutex_lock(&ceph_fsc_lock); + list_add_tail(&fsc->list, &ceph_fsc_list); + mutex_unlock(&ceph_fsc_lock); + return fsc; fail_cap_wq: @@ -717,6 +724,10 @@ static void destroy_fs_client(struct ceph_fs_client *fsc) { dout("destroy_fs_client %p\n", fsc); + mutex_lock(&ceph_fsc_lock); + list_del(&fsc->list); + mutex_unlock(&ceph_fsc_lock); + ceph_mdsc_destroy(fsc); destroy_workqueue(fsc->inode_wq); destroy_workqueue(fsc->cap_wq); @@ -1282,6 +1293,39 @@ static void __exit exit_ceph(void) destroy_caches(); } +static int param_set_metrics(const char *val, const struct kernel_param *kp) +{ + struct ceph_fs_client *fsc; + int ret; + + ret = param_set_bool(val, kp); + if (ret) { + pr_err("Failed to parse sending metrics switch value '%s'\n", + val); + return ret; + } else if (enable_send_metrics) { + // wake up all the mds clients + mutex_lock(&ceph_fsc_lock); + list_for_each_entry(fsc, &ceph_fsc_list, list) { + mutex_lock(&fsc->mdsc->mutex); + ceph_choose_new_metric_session(fsc->mdsc); + mutex_unlock(&fsc->mdsc->mutex); + } + mutex_unlock(&ceph_fsc_lock); + } + + return 0; +} + +static const struct kernel_param_ops param_ops_metrics = { + .set = param_set_metrics, + .get = param_get_bool, +}; + +bool enable_send_metrics = true; +module_param_cb(enable_send_metrics, ¶m_ops_metrics, &enable_send_metrics, 0644); +MODULE_PARM_DESC(enable_send_metrics, "Enable sending perf metrics to ceph cluster (default: on)"); + module_init(init_ceph); module_exit(exit_ceph); diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 5a6cdd3..05edc9a 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -101,6 +101,8 @@ struct ceph_mount_options { struct ceph_fs_client { struct super_block *sb; + struct list_head list; + struct ceph_mount_options *mount_options; struct ceph_client *client; diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h index ebf5ba6..455e9b9 100644 --- a/include/linux/ceph/ceph_fs.h +++ b/include/linux/ceph/ceph_fs.h @@ -130,6 +130,7 @@ struct ceph_dir_layout { #define CEPH_MSG_CLIENT_REQUEST 24 #define CEPH_MSG_CLIENT_REQUEST_FORWARD 25 #define CEPH_MSG_CLIENT_REPLY 26 +#define CEPH_MSG_CLIENT_METRICS 29 #define CEPH_MSG_CLIENT_CAPS 0x310 #define CEPH_MSG_CLIENT_LEASE 0x311 #define CEPH_MSG_CLIENT_SNAP 0x312 From patchwork Sun Jun 28 08:42:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11629967 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 09469138C for ; Sun, 28 Jun 2020 08:42:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E283B206D7 for ; Sun, 28 Jun 2020 08:42:40 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UJcaZC2q" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726182AbgF1Imk (ORCPT ); Sun, 28 Jun 2020 04:42:40 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:39265 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726038AbgF1Imj (ORCPT ); Sun, 28 Jun 2020 04:42:39 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1593333756; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:in-reply-to:in-reply-to:references:references; bh=e5+thvU8JVFJRquPvRQf6oqeMZulIlX8MqcpFcz1SNk=; b=UJcaZC2qyqrN+BS5clA9EQd7mghHW0FFpkYV20KzjLyuElOKDist1zKTuFfj21YAXAaIvO 3jSskZlNYBTmUTSydQaqJyJfB5Bzf0m9rGmbJfVjszRQuVFtfkAkbHWpOfx+UETdw06DMK JH+QbARvEaJREYVxHjJCilQwU31KfOI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-251-5AIVuOD8N76i76ycywL0rw-1; Sun, 28 Jun 2020 04:42:32 -0400 X-MC-Unique: 5AIVuOD8N76i76ycywL0rw-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 98A451800D42; Sun, 28 Jun 2020 08:42:31 +0000 (UTC) Received: from lxbceph0.gsslab.pek2.redhat.com (vm37-55.gsslab.pek2.redhat.com [10.72.37.55]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7E8451944D; Sun, 28 Jun 2020 08:42:29 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com Cc: zyan@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v4 4/5] ceph: switch to WARN_ON_ONCE and bubble up errnos to the callers Date: Sun, 28 Jun 2020 04:42:13 -0400 Message-Id: <1593333734-27480-5-git-send-email-xiubli@redhat.com> In-Reply-To: <1593333734-27480-1-git-send-email-xiubli@redhat.com> References: <1593333734-27480-1-git-send-email-xiubli@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Signed-off-by: Xiubo Li --- fs/ceph/mds_client.c | 46 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 35 insertions(+), 11 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 18f43a4..eda788b 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1169,7 +1169,7 @@ static struct ceph_msg *create_session_msg(u32 op, u64 seq) static const unsigned char feature_bits[] = CEPHFS_FEATURES_CLIENT_SUPPORTED; #define FEATURE_BYTES(c) (DIV_ROUND_UP((size_t)feature_bits[c - 1] + 1, 64) * 8) -static void encode_supported_features(void **p, void *end) +static int encode_supported_features(void **p, void *end) { static const size_t count = ARRAY_SIZE(feature_bits); @@ -1177,16 +1177,22 @@ static void encode_supported_features(void **p, void *end) size_t i; size_t size = FEATURE_BYTES(count); - BUG_ON(*p + 4 + size > end); + if (WARN_ON_ONCE(*p + 4 + size > end)) + return -ERANGE; + ceph_encode_32(p, size); memset(*p, 0, size); for (i = 0; i < count; i++) ((unsigned char*)(*p))[i / 8] |= BIT(feature_bits[i] % 8); *p += size; } else { - BUG_ON(*p + 4 > end); + if (WARN_ON_ONCE(*p + 4 > end)) + return -ERANGE; + ceph_encode_32(p, 0); } + + return 0; } /* @@ -1204,6 +1210,7 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 struct ceph_mount_options *fsopt = mdsc->fsc->mount_options; size_t size, count; void *p, *end; + int ret; const char* metadata[][2] = { {"hostname", mdsc->nodename}, @@ -1233,7 +1240,7 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 GFP_NOFS, false); if (!msg) { pr_err("create_session_msg ENOMEM creating msg\n"); - return NULL; + return ERR_PTR(-ENOMEM); } p = msg->front.iov_base; end = p + msg->front.iov_len; @@ -1270,7 +1277,13 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 p += val_len; } - encode_supported_features(&p, end); + ret = encode_supported_features(&p, end); + if (ret) { + pr_err("encode_supported_features failed!\n"); + ceph_msg_put(msg); + return ERR_PTR(ret); + } + msg->front.iov_len = p - msg->front.iov_base; msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); @@ -1298,8 +1311,8 @@ static int __open_session(struct ceph_mds_client *mdsc, /* send connect message */ msg = create_session_open_msg(mdsc, session->s_seq); - if (!msg) - return -ENOMEM; + if (IS_ERR(msg)) + return PTR_ERR(msg); ceph_con_send(&session->s_con, msg); return 0; } @@ -1313,6 +1326,7 @@ static int __open_session(struct ceph_mds_client *mdsc, __open_export_target_session(struct ceph_mds_client *mdsc, int target) { struct ceph_mds_session *session; + int ret; session = __ceph_lookup_mds_session(mdsc, target); if (!session) { @@ -1321,8 +1335,11 @@ static int __open_session(struct ceph_mds_client *mdsc, return session; } if (session->s_state == CEPH_MDS_SESSION_NEW || - session->s_state == CEPH_MDS_SESSION_CLOSING) - __open_session(mdsc, session); + session->s_state == CEPH_MDS_SESSION_CLOSING) { + ret = __open_session(mdsc, session); + if (ret) + return ERR_PTR(ret); + } return session; } @@ -2539,7 +2556,12 @@ static struct ceph_msg *create_request_message(struct ceph_mds_client *mdsc, ceph_encode_copy(&p, &ts, sizeof(ts)); } - BUG_ON(p > end); + if (WARN_ON_ONCE(p > end)) { + ceph_msg_put(msg); + msg = ERR_PTR(-ERANGE); + goto out_free2; + } + msg->front.iov_len = p - msg->front.iov_base; msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); @@ -2775,7 +2797,9 @@ static void __do_request(struct ceph_mds_client *mdsc, } if (session->s_state == CEPH_MDS_SESSION_NEW || session->s_state == CEPH_MDS_SESSION_CLOSING) { - __open_session(mdsc, session); + err = __open_session(mdsc, session); + if (err) + goto out_session; /* retry the same mds later */ if (random) req->r_resend_mds = mds; From patchwork Sun Jun 28 08:42:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 11629969 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 231A413B4 for ; Sun, 28 Jun 2020 08:42:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0C04E206D7 for ; Sun, 28 Jun 2020 08:42:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="gU3CyRM8" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726189AbgF1Imr (ORCPT ); Sun, 28 Jun 2020 04:42:47 -0400 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:46366 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726038AbgF1Imq (ORCPT ); Sun, 28 Jun 2020 04:42:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1593333765; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:in-reply-to:in-reply-to:references:references; bh=BCof1nnpSGj+PiX1OxJGZ2vWWATpWW5V9RparxIpk+0=; b=gU3CyRM88lIomiofrZfyJ4hE5Nc0WFzAX016/6p2YCLYyBZxrpoIjGFvbWhLroLS9zcTu5 t2c8iQHvFzDfzRRqhZq6auEWd8KIhdqMX3WtUs6KH57QkOSexH47pwjp4IlwI4sMCaAJ9r eXKltI6U3AB5RtAgMS98scJeGeG2kTc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-262-QYfMxbrSOCWoq6JMwj7H6g-1; Sun, 28 Jun 2020 04:42:40 -0400 X-MC-Unique: QYfMxbrSOCWoq6JMwj7H6g-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 632901009440; Sun, 28 Jun 2020 08:42:39 +0000 (UTC) Received: from lxbceph0.gsslab.pek2.redhat.com (vm37-55.gsslab.pek2.redhat.com [10.72.37.55]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1EB1396B62; Sun, 28 Jun 2020 08:42:31 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org, idryomov@gmail.com Cc: zyan@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v4 5/5] ceph: send client provided metric flags in client metadata Date: Sun, 28 Jun 2020 04:42:14 -0400 Message-Id: <1593333734-27480-6-git-send-email-xiubli@redhat.com> In-Reply-To: <1593333734-27480-1-git-send-email-xiubli@redhat.com> References: <1593333734-27480-1-git-send-email-xiubli@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Will send the metric flags to MDS, currently it supports the cap, read latency, write latency and metadata latency. URL: https://tracker.ceph.com/issues/43435 Signed-off-by: Xiubo Li --- fs/ceph/mds_client.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++-- fs/ceph/metric.h | 13 ++++++++++++ 2 files changed, 71 insertions(+), 2 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index eda788b..a3f75d8 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1195,6 +1195,48 @@ static int encode_supported_features(void **p, void *end) return 0; } +static const unsigned char metric_bits[] = CEPHFS_METRIC_SPEC_CLIENT_SUPPORTED; +#define METRIC_BYTES(cnt) (DIV_ROUND_UP((size_t)metric_bits[cnt - 1] + 1, 64) * 8) +static int encode_metric_spec(void **p, void *end) +{ + static const size_t count = ARRAY_SIZE(metric_bits); + + /* header */ + if (WARN_ON_ONCE(*p + 2 > end)) + return -ERANGE; + + ceph_encode_8(p, 1); /* version */ + ceph_encode_8(p, 1); /* compat */ + + if (count > 0) { + size_t i; + size_t size = METRIC_BYTES(count); + + if (WARN_ON_ONCE(*p + 4 + 4 + size > end)) + return -ERANGE; + + /* metric spec info length */ + ceph_encode_32(p, 4 + size); + + /* metric spec */ + ceph_encode_32(p, size); + memset(*p, 0, size); + for (i = 0; i < count; i++) + ((unsigned char *)(*p))[i / 8] |= BIT(metric_bits[i] % 8); + *p += size; + } else { + if (WARN_ON_ONCE(*p + 4 + 4 > end)) + return -ERANGE; + + /* metric spec info length */ + ceph_encode_32(p, 4); + /* metric spec */ + ceph_encode_32(p, 0); + } + + return 0; +} + /* * session message, specialization for CEPH_SESSION_REQUEST_OPEN * to include additional client metadata fields. @@ -1235,6 +1277,13 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 size = FEATURE_BYTES(count); extra_bytes += 4 + size; + /* metric spec */ + size = 0; + count = ARRAY_SIZE(metric_bits); + if (count > 0) + size = METRIC_BYTES(count); + extra_bytes += 2 + 4 + 4 + size; + /* Allocate the message */ msg = ceph_msg_new(CEPH_MSG_CLIENT_SESSION, sizeof(*h) + extra_bytes, GFP_NOFS, false); @@ -1253,9 +1302,9 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 * Serialize client metadata into waiting buffer space, using * the format that userspace expects for map * - * ClientSession messages with metadata are v3 + * ClientSession messages with metadata are v4 */ - msg->hdr.version = cpu_to_le16(3); + msg->hdr.version = cpu_to_le16(4); msg->hdr.compat_version = cpu_to_le16(1); /* The write pointer, following the session_head structure */ @@ -1284,6 +1333,13 @@ static struct ceph_msg *create_session_open_msg(struct ceph_mds_client *mdsc, u6 return ERR_PTR(ret); } + ret = encode_metric_spec(&p, end); + if (ret) { + pr_err("encode_metric_spec failed!\n"); + ceph_msg_put(msg); + return ERR_PTR(ret); + } + msg->front.iov_len = p - msg->front.iov_base; msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); diff --git a/fs/ceph/metric.h b/fs/ceph/metric.h index 1a7e3fb..6de86d3 100644 --- a/fs/ceph/metric.h +++ b/fs/ceph/metric.h @@ -21,6 +21,19 @@ enum ceph_metric_type { CLIENT_METRIC_TYPE_MAX = CLIENT_METRIC_TYPE_DENTRY_LEASE, }; +/* + * This will always have the highest metric bit value + * as the last element of the array. + */ +#define CEPHFS_METRIC_SPEC_CLIENT_SUPPORTED { \ + CLIENT_METRIC_TYPE_CAP_INFO, \ + CLIENT_METRIC_TYPE_READ_LATENCY, \ + CLIENT_METRIC_TYPE_WRITE_LATENCY, \ + CLIENT_METRIC_TYPE_METADATA_LATENCY, \ + \ + CLIENT_METRIC_TYPE_MAX, \ +} + /* metric caps header */ struct ceph_metric_cap { __le32 type; /* ceph metric type */