
[RFC,v1,0/4] Implement performance impact measurement tool

Message ID 20240816005943.1832694-1-ivanov.mikhail1@huawei-partners.com (mailing list archive)

Message

Mikhail Ivanov Aug. 16, 2024, 12:59 a.m. UTC
Hello! This is an RFC v1 patchset dedicated to Landlock performance
measurement.

Landlock LSM hooks are executed on many operations on Linux internal
objects (files, sockets). These hooks can noticeably affect the performance
of such operations, as was demonstrated in the filesystem caching
patchset [1]. Being able to calculate the Landlock performance overhead
makes it possible to compare kernel changes and estimate the acceptability
of new features (e.g. [2], [3], [4]).

Syscall execution time was chosen as the measured metric.
Landlock performance overhead is defined as the difference between the
syscall duration in sandboxed mode and in default mode.
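
As a minimal sketch of this definition (not code from the patchset; the
`overhead` helper name and its arguments are hypothetical), the overhead can
be derived from two average durations in microseconds and printed in the
same duration+overhead layout used in the tracing results:

```shell
# overhead BASELINE_US SANDBOXED_US
# Hypothetical helper, not part of the patchset: prints the baseline duration,
# the absolute overhead (sandboxed - baseline) and the overhead relative to
# the baseline.
overhead() {
	awk -v b="$1" -v s="$2" 'BEGIN {
		d = s - b                          # absolute overhead in us
		p = (b > 0) ? 100 * d / b : 0      # relative overhead in percent
		printf "%.2f+%.2f(%+.1f%%)\n", b, d, p
	}'
}

overhead 2.00 2.50    # prints 2.00+0.50(+25.0%)
```

The relative value is computed against the baseline duration, which is why
an imprecise baseline distorts the percentage even when the absolute
difference is measured correctly.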

Initially, perf trace was chosen as the tracer for measuring syscall
durations, but I found that it can show imprecise values. This does not
affect the absolute overhead value, but it distorts the proportion of
overhead relative to the baseline syscall duration. Moreover, using
perf trace introduced some measurement noise.

AFAICS all of this is caused by its implementation and perf event handlers.
Until someone figures out whether these issues can be fixed, I suggest
using the simple libbpf-based program provided in this patchset, which
uses per-syscall tracepoints and calculates average durations for the
specified syscalls. It has a simple implementation based on a few small
BPF programs and provides more precise metrics.
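
The bookkeeping behind such a tracer can be modelled in user space. The
following is only an illustrative sketch, not the BPF code from the
patchset: it assumes a made-up text format with one `enter` or `exit` record
per line (thread ID and timestamp in nanoseconds), pairs each exit with the
entry recorded for the same thread, and reports the mean duration. The
`avg_duration` name and the record format are hypothetical.

```shell
# avg_duration: read "enter|exit <tid> <timestamp_ns>" records on stdin and
# print the number of completed calls and their mean duration in microseconds.
avg_duration() {
	awk '
	$1 == "enter" { start[$2] = $3 }     # remember the entry time per thread
	$1 == "exit" && ($2 in start) {      # skip exits with no recorded entry
		total += $3 - start[$2]
		calls++
		delete start[$2]
	}
	END {
		if (calls > 0)
			printf "%d calls, %.2f us avg\n", calls, total / calls / 1000
		else
			print "0 calls"
	}'
}

printf '%s\n' \
	'enter 1 1000' \
	'enter 2 1500' \
	'exit 1 3000' \
	'exit 2 4500' |
	avg_duration    # prints: 2 calls, 2.50 us avg
```

The tracer described above does this accounting in BPF programs attached to
per-syscall tracepoints; the sketch only mirrors the arithmetic.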

This patchset also implements a Landlock sandboxer whose ruleset can be
customized in various ways.

Currently, the following workloads are implemented:
* A simple script for syscall microbenchmarking with `openat` support.
* A script that executes the find tool on Linux source files with various
  depths and sandboxer configurations.

Microbenchmarks can currently use only simple rulesets with a small number
of rules, but in the next patches they should be extended to support
large rulesets with different numbers of layers.

Here is an example of how this tool can be used to measure the Landlock
read access overhead for a workload that runs the find tool on Linux
source files (with depth 5):

    # ./bench/run.sh -t fs:.topology:4 -e openat -s -b \
    #    $FIND $LINUX_SRC -mindepth 5 -maxdepth 5 -exec file '{}' \;

    Tracing baseline workload...
    376.294s elapsed
    Tracing sandboxed workload...
    381.298s elapsed

    Tracing results
    ===============
    cmd: /usr/bin/find /root/linux -mindepth 5 -maxdepth 5 -exec file '{}' \;
    syscalls: openat
    access: 4
    overhead:
        syscall                  bcalls     scalls   duration+overhead(us)
        =======                  ======     ======   =====================
        syscall-257             1498623    1770882       1.88+0.46(+24.0%)

Please share your opinions on the design of the tool and your ideas for
improving the measurement and workloads!

[1] https://lore.kernel.org/all/20210630224856.1313928-1-mic@digikod.net/
[2] https://github.com/landlock-lsm/linux/issues/10
[3] https://github.com/landlock-lsm/linux/issues/19
[4] https://github.com/landlock-lsm/linux/issues/1

Closes: https://github.com/landlock-lsm/linux/issues/24

Mikhail Ivanov (4):
  selftests/landlock: Implement performance impact measurement tool
  selftests/landlock: Implement per-syscall microbenchmarks
  selftests/landlock: Implement custom libbpf-based tracer
  selftests/landlock: Add realworld workload based on find tool

 tools/testing/selftests/Makefile              |   1 +
 .../testing/selftests/landlock/bench/Makefile | 179 ++++++++
 .../landlock/bench/bench_find_on_linux.sh     |  84 ++++
 .../testing/selftests/landlock/bench/common.c | 283 ++++++++++++
 .../testing/selftests/landlock/bench/common.h |  18 +
 tools/testing/selftests/landlock/bench/config |  10 +
 .../selftests/landlock/bench/microbench.c     | 192 ++++++++
 .../selftests/landlock/bench/progs/tracer.c   | 126 ++++++
 tools/testing/selftests/landlock/bench/run.sh | 409 ++++++++++++++++++
 .../selftests/landlock/bench/sandboxer.c      | 117 +++++
 .../testing/selftests/landlock/bench/tracer.c | 278 ++++++++++++
 .../selftests/landlock/bench/tracer_common.h  |  15 +
 12 files changed, 1712 insertions(+)
 create mode 100644 tools/testing/selftests/landlock/bench/Makefile
 create mode 100755 tools/testing/selftests/landlock/bench/bench_find_on_linux.sh
 create mode 100644 tools/testing/selftests/landlock/bench/common.c
 create mode 100644 tools/testing/selftests/landlock/bench/common.h
 create mode 100644 tools/testing/selftests/landlock/bench/config
 create mode 100644 tools/testing/selftests/landlock/bench/microbench.c
 create mode 100644 tools/testing/selftests/landlock/bench/progs/tracer.c
 create mode 100755 tools/testing/selftests/landlock/bench/run.sh
 create mode 100644 tools/testing/selftests/landlock/bench/sandboxer.c
 create mode 100644 tools/testing/selftests/landlock/bench/tracer.c
 create mode 100644 tools/testing/selftests/landlock/bench/tracer_common.h


base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b

Comments

Mikhail Ivanov Sept. 23, 2024, 3:05 p.m. UTC | #1
On 8/16/2024 3:59 AM, Mikhail Ivanov wrote:
> Hello! This is an RFC v1 patchset dedicated to Landlock performance
> measurement.
>
> Landlock LSM hooks are executed on many operations on Linux internal
> objects (files, sockets). These hooks can noticeably affect the performance
> of such operations, as was demonstrated in the filesystem caching
> patchset [1]. Being able to calculate the Landlock performance overhead
> makes it possible to compare kernel changes and estimate the acceptability
> of new features (e.g. [2], [3], [4]).

Hello! A kind reminder about this patchset. The UDP-dedicated RFC v1
patchset [1] was published, and this patchset could be useful for
benchmarking the sendmsg/recvmsg hooks. But it would probably be better to
apply the other network patches first. WDYT?

[1] https://lore.kernel.org/all/20240916122230.114800-1-matthieu@buffet.re/#t