[v5,0/10] ceph: add perf metrics support

Message ID 20200128130248.4266-1-xiubli@redhat.com

Message

Xiubo Li Jan. 28, 2020, 1:02 p.m. UTC
From: Xiubo Li <xiubli@redhat.com>

Changed in V2:
- add read/write/metadata latency metric support.
- add and send client provided metric flags in client metadata
- addressed the comments from Ilya and merged the 4/4 patch into 3/4.
- addressed all the other comments in v1 series.

Changed in V3:
- addressed Jeff's comments and let the callers do the metric
  counting.
- with some small fixes for the read/write latency
- tested based on the latest testing branch

Changed in V4:
- fix the lock issue

Changed in V5:
- add r_end_stamp for the osdc request
- delete reset metric and move it to metric sysfs
- move ceph_osdc_{read,write}pages to ceph.ko
- use percpu counters instead for read/write/metadata latencies
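
For reference, here is a minimal sketch of the percpu-counter approach
mentioned in the last bullet above. It is illustrative only and not the
code from fs/ceph/metric.h; the structure and function names are made up
for the example.

#include <linux/percpu_counter.h>
#include <linux/ktime.h>

struct example_latency_metric {
	struct percpu_counter total;	/* number of completed requests */
	struct percpu_counter sum_ns;	/* accumulated latency, in ns */
};

static int example_metric_init(struct example_latency_metric *m)
{
	int ret;

	ret = percpu_counter_init(&m->total, 0, GFP_KERNEL);
	if (ret)
		return ret;

	ret = percpu_counter_init(&m->sum_ns, 0, GFP_KERNEL);
	if (ret)
		percpu_counter_destroy(&m->total);
	return ret;
}

/* Called from the read/write/metadata completion paths. */
static void example_metric_update(struct example_latency_metric *m,
				  ktime_t start, ktime_t end)
{
	percpu_counter_inc(&m->total);
	percpu_counter_add(&m->sum_ns, ktime_to_ns(ktime_sub(end, start)));
}

The debugfs averages shown below would then be
percpu_counter_sum(&m->sum_ns) divided by percpu_counter_sum(&m->total).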

The client will send the metrics to the MDSs every second if sending_metrics is enabled; it is disabled by default.
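
A rough sketch of the once-per-second scheduling pattern described above,
using a standard delayed work item. The names here are hypothetical and this
only shows the pattern, not the actual change made in the series.

#include <linux/workqueue.h>
#include <linux/jiffies.h>

static struct delayed_work example_metric_work;
static bool example_sending_metrics;	/* toggled via a debugfs file */

static void example_metric_workfn(struct work_struct *work)
{
	if (example_sending_metrics) {
		/* build and send the metrics message to the MDSs here */
	}

	/* re-arm so the metrics go out roughly every second */
	schedule_delayed_work(&example_metric_work,
			      round_jiffies_relative(HZ));
}

static void example_metric_start(void)
{
	INIT_DELAYED_WORK(&example_metric_work, example_metric_workfn);
	schedule_delayed_work(&example_metric_work,
			      round_jiffies_relative(HZ));
}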


We can get the metrics from debugfs:

$ cat /sys/kernel/debug/ceph/0c93a60d-5645-4c46-8568-4c8f63db4c7f.client4267/metrics 
item          total       sum_lat(us)     avg_lat(us)
-----------------------------------------------------
read          13          417000          32076
write         42          131205000       3123928
metadata      104         493000          4740

item          total           miss            hit
-------------------------------------------------
d_lease       204             0               918

session       caps            miss            hit
-------------------------------------------------
0             204             213             368218
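
(Here avg_lat is simply sum_lat / total; for example, 417000 / 13 ≈ 32076 us
on the read line.)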


On the MDS side, we can get the metrics (note: the latencies are in
nanoseconds):

$ ./bin/ceph fs perf stats | python -m json.tool
{
    "client_metadata": {
        "client.4267": {
            "IP": "v1:192.168.195.165",
            "hostname": "fedora1",
            "mount_point": "N/A",
            "root": "/"
        }
    },
    "counters": [
        "cap_hit"
    ],
    "global_counters": [
        "read_latency",
        "write_latency",
        "metadata_latency",
        "dentry_lease_hit"
    ],
    "global_metrics": {
        "client.4267": [
            [
                0,
                32076923
            ],
            [
                3,
                123928571
            ],
            [
                0,
                4740384
            ],
            [
                918,
                0
            ]
        ]
    },
    "metrics": {
        "delayed_ranks": [],
        "mds.0": {
            "client.4267": [
                [
                    368218,
                    213
                ]
            ]
        }
    }
}
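
Read together with the debugfs output above, each global_metrics entry appears
to correspond, in order, to the global_counters list: the latency entries look
like (seconds, nanoseconds) pairs (e.g. [0, 32076923] matches the ~32076 us
read average and [3, 123928571] the ~3123928 us write average), [918, 0]
matches the d_lease hit/miss counts, and the per-MDS [368218, 213] pair
matches the session cap hit/miss counts.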


The metric flags provided in the client metadata:

$ ./bin/cephfs-journal-tool --rank=1:0 event get --type=SESSION json
Wrote output to JSON file 'dump'
$ cat dump
[ 
    {
        "client instance": "client.4275 v1:192.168.195.165:0/461391971",
        "open": "true",
        "client map version": 1,
        "inos": "[]",
        "inotable version": 0,
        "client_metadata": {
            "client_features": {
                "feature_bits": "0000000000001bff"
            },
            "metric_spec": {
                "metric_flags": {
                    "feature_bits": "000000000000001f"
                }
            },
            "entity_id": "",
            "hostname": "fedora1",
            "kernel_version": "5.5.0-rc2+",
            "root": "/"
        }
    },
[...]
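
The metric_flags value 0x1f above has the low five bits set, which lines up
with the five metric types this series deals with (cap info, read latency,
write latency, metadata latency and dentry lease). A purely hypothetical
illustration of building such a mask follows; the real bit names and values
are defined by the series and the MDS, not by this sketch.

#include <linux/bits.h>

#define EXAMPLE_METRIC_CAP_INFO			BIT(0)
#define EXAMPLE_METRIC_READ_LATENCY		BIT(1)
#define EXAMPLE_METRIC_WRITE_LATENCY		BIT(2)
#define EXAMPLE_METRIC_METADATA_LATENCY		BIT(3)
#define EXAMPLE_METRIC_DENTRY_LEASE		BIT(4)

/* all five bits set: 0x1f, matching the feature_bits dump above */
#define EXAMPLE_METRIC_ALL			(BIT(5) - 1)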





Xiubo Li (10):
  ceph: add caps perf metric for each session
  ceph: move ceph_osdc_{read,write}pages to ceph.ko
  ceph: add r_end_stamp for the osdc request
  ceph: add global read latency metric support
  ceph: add global write latency metric support
  ceph: add global metadata perf metric support
  ceph: periodically send perf metrics to MDS
  ceph: add CEPH_DEFINE_RW_FUNC helper support
  ceph: add reset metrics support
  ceph: send client provided metric flags in client metadata

 fs/ceph/acl.c                   |   2 +
 fs/ceph/addr.c                  | 106 ++++++++++-
 fs/ceph/caps.c                  |  74 ++++++++
 fs/ceph/debugfs.c               | 140 +++++++++++++-
 fs/ceph/dir.c                   |   9 +-
 fs/ceph/file.c                  |  26 +++
 fs/ceph/mds_client.c            | 327 +++++++++++++++++++++++++++++---
 fs/ceph/mds_client.h            |  15 +-
 fs/ceph/metric.h                | 150 +++++++++++++++
 fs/ceph/quota.c                 |   9 +-
 fs/ceph/super.h                 |  13 ++
 fs/ceph/xattr.c                 |  17 +-
 include/linux/ceph/ceph_fs.h    |   1 +
 include/linux/ceph/debugfs.h    |  14 ++
 include/linux/ceph/osd_client.h |  18 +-
 net/ceph/osd_client.c           |  81 +-------
 16 files changed, 862 insertions(+), 140 deletions(-)
 create mode 100644 fs/ceph/metric.h

Comments

Ilya Dryomov Jan. 28, 2020, 2:16 p.m. UTC | #1
On Tue, Jan 28, 2020 at 2:03 PM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> Changed in V2:
> - add read/write/metadata latency metric support.
> - add and send client provided metric flags in client metadata
> - addressed the comments from Ilya and merged the 4/4 patch into 3/4.
> - addressed all the other comments in v1 series.
>
> Changed in V3:
> - addressed Jeff's comments and let the callers do the metric
> counting.
> - with some small fixes for the read/write latency
> - tested based on the latest testing branch
>
> Changed in V4:
> - fix the lock issue
>
> Changed in V5:
> - add r_end_stamp for the osdc request
> - delete reset metric and move it to metric sysfs
> - move ceph_osdc_{read,write}pages to ceph.ko
> - use percpu counters instead for read/write/metadata latencies
>
> The client will send the metrics to the MDSs every second if sending_metrics is enabled; it is disabled by default.

Hi Xiubo,

What is this series based on?  "[PATCH v5 01/10] ceph: add caps perf
metric for each session" changes metric_show() in fs/ceph/debugfs.c,
but there is no such function upstream or in the testing branch.

Thanks,

                Ilya
Xiubo Li Jan. 29, 2020, 8:23 a.m. UTC | #2
On 2020/1/28 22:16, Ilya Dryomov wrote:
> On Tue, Jan 28, 2020 at 2:03 PM <xiubli@redhat.com> wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> Changed in V2:
>> - add read/write/metadata latency metric support.
>> - add and send client provided metric flags in client metadata
>> - addressed the comments from Ilya and merged the 4/4 patch into 3/4.
>> - addressed all the other comments in v1 series.
>>
>> Changed in V3:
>> - addressed Jeff's comments and let the callers do the metric
>> counting.
>> - with some small fixes for the read/write latency
>> - tested based on the latest testing branch
>>
>> Changed in V4:
>> - fix the lock issue
>>
>> Changed in V5:
>> - add r_end_stamp for the osdc request
>> - delete reset metric and move it to metric sysfs
>> - move ceph_osdc_{read,write}pages to ceph.ko
>> - use percpu counters instead for read/write/metadata latencies
>>
>> The client will send the metrics to the MDSs every second if sending_metrics is enabled; it is disabled by default.
> Hi Xiubo,
>
> What is this series based on?  "[PATCH v5 01/10] ceph: add caps perf
> metric for each session" changes metric_show() in fs/ceph/debugfs.c,
> but there is no such function upstream or in the testing branch.

There are actually 11 patches; I missed the first one and will resend it.

Thanks.


> Thanks,
>
>                  Ilya
>