[RFC,v1,0/4] Implement performance impact measurement tool


Message ID

20240816005943.1832694-1-ivanov.mikhail1@huawei-partners.com (mailing list archive)

Headers

From: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
To: <mic@digikod.net>
CC: <willemdebruijn.kernel@gmail.com>, <gnoack3000@gmail.com>,
	<linux-security-module@vger.kernel.org>, <netdev@vger.kernel.org>,
	<netfilter-devel@vger.kernel.org>, <yusongping@huawei.com>,
	<artem.kuzin@huawei.com>, <konstantin.meskhidze@huawei.com>
Subject: [RFC PATCH v1 0/4] Implement performance impact measurement tool
Date: Fri, 16 Aug 2024 08:59:39 +0800
Message-ID: <20240816005943.1832694-1-ivanov.mikhail1@huawei-partners.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain

Series

Implement performance impact measurement tool

Message

Mikhail Ivanov Aug. 16, 2024, 12:59 a.m. UTC

Hello! This is the v1 RFC patchset dedicated to Landlock performance
measurement.

Landlock LSM hooks are executed during many operations on Linux internal
objects (files, sockets). These hooks can noticeably affect the
performance of such operations, as was demonstrated in the filesystem
caching patchset [1]. Being able to measure Landlock performance overhead
makes it possible to compare kernel changes and to assess whether new
features are acceptable (e.g. [2], [3], [4]).

Syscall execution time was chosen as the measured metric.
Landlock performance overhead is defined as the difference between the
syscall duration in sandboxed mode and in default (baseline) mode.
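
As a worked illustration of this definition, here is a minimal Python
sketch. The function name and the input numbers are hypothetical and are
not taken from the patchset; the percentage is computed relative to the
baseline duration.

```python
from typing import Tuple


def landlock_overhead(baseline_us: float, sandboxed_us: float) -> Tuple[float, float]:
    """Return (absolute overhead in us, overhead as a percentage of baseline)."""
    if baseline_us <= 0:
        raise ValueError("baseline duration must be positive")
    # Overhead is the extra time a syscall takes when the task is sandboxed.
    overhead_us = sandboxed_us - baseline_us
    return overhead_us, 100.0 * overhead_us / baseline_us


# Hypothetical average durations, not measured values.
overhead_us, percent = landlock_overhead(baseline_us=2.00, sandboxed_us=2.50)
print(f"{overhead_us:.2f} us (+{percent:.1f}%)")  # prints "0.50 us (+25.0%)"
```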

Initially, perf trace was chosen as the tracer for measuring syscall
durations, but I found that it can report imprecise values. This does not
affect the absolute overhead value, but it distorts the proportion of
overhead relative to the baseline syscall duration. Moreover, using
perf trace introduced some measurement noise.

AFAICS all of this is caused by the perf trace implementation and its
perf event handlers. Until someone figures out whether these issues can
be fixed, I suggest using the simple libbpf-based program provided in this
patchset, which attaches to per-syscall tracepoints and calculates average
durations for the specified syscalls. It has a simple implementation based
on a few small BPF programs and provides more precise metrics.
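
To illustrate the idea of pairing per-syscall enter/exit events and
averaging their durations, here is a small userspace simulation in Python.
It is only a sketch of the general technique, not the BPF code from this
patchset; the event tuple layout and the function name are assumptions
made for the example.

```python
from collections import defaultdict


def average_durations(events, wanted):
    """Average syscall durations from (timestamp_ns, tid, syscall_nr, kind) events.

    kind is "enter" or "exit". Returns {syscall_nr: (calls, average_us)}.
    """
    start = {}                # (tid, syscall_nr) -> enter timestamp
    total = defaultdict(int)  # syscall_nr -> summed duration in ns
    calls = defaultdict(int)  # syscall_nr -> number of completed calls
    for ts, tid, nr, kind in events:
        if nr not in wanted:
            continue  # only the specified syscalls are measured
        key = (tid, nr)
        if kind == "enter":
            start[key] = ts
        elif kind == "exit" and key in start:
            total[nr] += ts - start.pop(key)
            calls[nr] += 1
    return {nr: (calls[nr], total[nr] / calls[nr] / 1000.0) for nr in calls}


# Hypothetical event stream; 257 is openat on x86_64.
events = [
    (1_000, 1, 257, "enter"),
    (1_500, 2, 257, "enter"),
    (3_000, 1, 257, "exit"),  # 2000 ns on thread 1
    (4_500, 2, 257, "exit"),  # 3000 ns on thread 2
    (5_000, 1, 0, "enter"),   # not in the wanted set, ignored
    (5_100, 1, 0, "exit"),
]
print(average_durations(events, wanted={257}))  # prints {257: (2, 2.5)}
```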

This patchset also implements a Landlock sandboxer that allows the
ruleset to be customized flexibly.

Currently, the following workloads are implemented:
* A simple script for syscall microbenchmarking with `openat` support.
* A script that runs the find tool over the Linux source tree with various
  depths and sandboxer configurations.
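
For readers unfamiliar with syscall microbenchmarking, the following
Python sketch shows the general shape of such a workload: call `openat`
in a tight loop and report the average time per call. It is not the
microbenchmark from this patchset; the function name and iteration count
are assumptions, and the measured time also includes interpreter overhead
and the matching close() call.

```python
import os
import tempfile
import time


def time_openat(dir_fd, name, iterations=10_000):
    """Average wall-clock time in us of one open-by-dir_fd plus close."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        # Passing dir_fd makes CPython open the path relative to that
        # directory descriptor, i.e. through openat().
        fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
        os.close(fd)
    elapsed_ns = time.perf_counter_ns() - start
    return elapsed_ns / iterations / 1000.0


with tempfile.TemporaryDirectory() as tmp:
    # Create an empty file to open repeatedly.
    open(os.path.join(tmp, "f"), "w").close()
    dir_fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
    try:
        avg_us = time_openat(dir_fd, "f", iterations=1_000)
    finally:
        os.close(dir_fd)

print(f"average open+close: {avg_us:.2f} us")
```

Running the same loop once without and once with a sandbox applied, then
subtracting the two averages, gives the overhead in the sense defined
above.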

For now, microbenchmarks support only simple rulesets with a small number
of rules, but in future patches they should be extended to support
large rulesets with different numbers of layers.

Here is an example of how this tool can be used to measure the Landlock
overhead of read access for a workload that runs the find tool on the
Linux source files (with depth 5):

    # ./bench/run.sh -t fs:.topology:4 -e openat -s -b \
         $FIND $LINUX_SRC -mindepth 5 -maxdepth 5 -exec file '{}' \;

    Tracing baseline workload...
    376.294s elapsed
    Tracing sandboxed workload...
    381.298s elapsed

    Tracing results
    ===============
    cmd: /usr/bin/find /root/linux -mindepth 5 -maxdepth 5 -exec file '{}' \;
    syscalls: openat
    access: 4
    overhead:
        syscall                  bcalls     scalls   duration+overhead(us)
        =======                  ======     ======   =====================
        syscall-257             1498623    1770882       1.88+0.46(+24.0%)
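
If the results need to be post-processed, a row of the overhead table can
be parsed with a short Python sketch like the one below. The field names
are assumptions derived from the column headers (bcalls and scalls appear
to be the call counts of the baseline and sandboxed runs), and the regular
expression only targets the row format shown above.

```python
import re
from typing import NamedTuple


class OverheadRow(NamedTuple):
    syscall: str
    bcalls: int          # calls counted in the baseline run (assumed meaning)
    scalls: int          # calls counted in the sandboxed run (assumed meaning)
    duration_us: float   # baseline duration
    overhead_us: float   # absolute overhead
    overhead_pct: float  # overhead as printed in parentheses


ROW_RE = re.compile(
    r"^\s*(?P<syscall>\S+)\s+(?P<bcalls>\d+)\s+(?P<scalls>\d+)\s+"
    r"(?P<duration>\d+(?:\.\d+)?)(?P<overhead>[+-]\d+(?:\.\d+)?)"
    r"\((?P<pct>[+-]\d+(?:\.\d+)?)%\)\s*$"
)


def parse_row(line):
    """Parse one 'syscall bcalls scalls duration+overhead(pct%)' row."""
    m = ROW_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized row: {line!r}")
    return OverheadRow(
        m["syscall"], int(m["bcalls"]), int(m["scalls"]),
        float(m["duration"]), float(m["overhead"]), float(m["pct"]),
    )


row = parse_row(
    "        syscall-257             1498623    1770882       1.88+0.46(+24.0%)"
)
print(row.syscall, row.duration_us, row.overhead_us, row.overhead_pct)
```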

Please share your opinions on the design of the tool and your ideas for
improving the measurements and workloads!

[1] https://lore.kernel.org/all/20210630224856.1313928-1-mic@digikod.net/
[2] https://github.com/landlock-lsm/linux/issues/10
[3] https://github.com/landlock-lsm/linux/issues/19
[4] https://github.com/landlock-lsm/linux/issues/1

Closes: https://github.com/landlock-lsm/linux/issues/24

Mikhail Ivanov (4):
  selftests/landlock: Implement performance impact measurement tool
  selftests/landlock: Implement per-syscall microbenchmarks
  selftests/landlock: Implement custom libbpf-based tracer
  selftests/landlock: Add realworld workload based on find tool

 tools/testing/selftests/Makefile              |   1 +
 .../testing/selftests/landlock/bench/Makefile | 179 ++++++++
 .../landlock/bench/bench_find_on_linux.sh     |  84 ++++
 .../testing/selftests/landlock/bench/common.c | 283 ++++++++++++
 .../testing/selftests/landlock/bench/common.h |  18 +
 tools/testing/selftests/landlock/bench/config |  10 +
 .../selftests/landlock/bench/microbench.c     | 192 ++++++++
 .../selftests/landlock/bench/progs/tracer.c   | 126 ++++++
 tools/testing/selftests/landlock/bench/run.sh | 409 ++++++++++++++++++
 .../selftests/landlock/bench/sandboxer.c      | 117 +++++
 .../testing/selftests/landlock/bench/tracer.c | 278 ++++++++++++
 .../selftests/landlock/bench/tracer_common.h  |  15 +
 12 files changed, 1712 insertions(+)
 create mode 100644 tools/testing/selftests/landlock/bench/Makefile
 create mode 100755 tools/testing/selftests/landlock/bench/bench_find_on_linux.sh
 create mode 100644 tools/testing/selftests/landlock/bench/common.c
 create mode 100644 tools/testing/selftests/landlock/bench/common.h
 create mode 100644 tools/testing/selftests/landlock/bench/config
 create mode 100644 tools/testing/selftests/landlock/bench/microbench.c
 create mode 100644 tools/testing/selftests/landlock/bench/progs/tracer.c
 create mode 100755 tools/testing/selftests/landlock/bench/run.sh
 create mode 100644 tools/testing/selftests/landlock/bench/sandboxer.c
 create mode 100644 tools/testing/selftests/landlock/bench/tracer.c
 create mode 100644 tools/testing/selftests/landlock/bench/tracer_common.h


base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b

Comments

Mikhail Ivanov Sept. 23, 2024, 3:05 p.m. UTC | #1

On 8/16/2024 3:59 AM, Mikhail Ivanov wrote:
> Hello! This is the v1 RFC patchset dedicated to Landlock performance
> measurement.
> 
> Landlock LSM hooks are executed during many operations on Linux internal
> objects (files, sockets). These hooks can noticeably affect the
> performance of such operations, as was demonstrated in the filesystem
> caching patchset [1]. Being able to measure Landlock performance overhead
> makes it possible to compare kernel changes and to assess whether new
> features are acceptable (e.g. [2], [3], [4]).

Hello! A kind reminder about this patchset. A UDP-dedicated RFC v1
patchset [1] was published, and this patchset could be useful for
benchmarking its sendmsg/recvmsg hooks. But it would probably be better
to apply the other network patches first. WDYT?

[1] https://lore.kernel.org/all/20240916122230.114800-1-matthieu@buffet.re/#t