[v1,0/7] PMU performance improvements

Message ID 20231007021326.4156714-1-irogers@google.com (mailing list archive)

Ian Rogers Oct. 7, 2023, 2:13 a.m. UTC
Improve the performance of PMU scanning by holding onto the
event/metric tables for a cpuid (avoiding repeated regular expression
comparisons) and by lazily computing the default perf_event_attr for a PMU.

Before:
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 251.990 usec (+- 4.009 usec)
  Average PMU scanning took: 3222.460 usec (+- 211.234 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 260.120 usec (+- 7.905 usec)
  Average PMU scanning took: 3228.995 usec (+- 211.196 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 252.310 usec (+- 3.980 usec)
  Average PMU scanning took: 3220.675 usec (+- 210.844 usec)


After:
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 28.530 usec (+- 0.602 usec)
  Average PMU scanning took: 275.725 usec (+- 18.253 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 28.720 usec (+- 0.446 usec)
  Average PMU scanning took: 271.015 usec (+- 18.762 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 31.040 usec (+- 0.612 usec)
  Average PMU scanning took: 267.340 usec (+- 17.209 usec)

Measured with the pmu-scan benchmark on a Tiger Lake laptop, core PMU
scanning takes 11.5% of its previous execution time and scanning all
PMUs takes 8.4% of its previous execution time. There is also a
4.3% reduction in openat system calls.
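
The percentages can be reproduced from the three runs listed in each
set of benchmark output. The sketch below assumes they are the ratio
of the mean "after" time to the mean "before" time; the letter does
not say how they were derived, and the helper names are illustrative.

```python
# Average scan times in usec, copied from the three runs in each set.
before_core = [251.990, 260.120, 252.310]
after_core = [28.530, 28.720, 31.040]
before_all = [3222.460, 3228.995, 3220.675]
after_all = [275.725, 271.015, 267.340]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def remaining_percent(before, after):
    """Mean 'after' time as a percentage of the mean 'before' time."""
    return 100.0 * mean(after) / mean(before)


core_pct = remaining_percent(before_core, after_core)
all_pct = remaining_percent(before_all, after_all)
print(f"core PMU scanning: {core_pct:.1f}% of previous time")
print(f"all PMU scanning:  {all_pct:.1f}% of previous time")
```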

Ian Rogers (7):
  perf pmu: Rename perf_pmu__get_default_config to perf_pmu__arch_init
  perf intel-pt: Move PMU initialization from default config code
  perf arm-spe: Move PMU initialization from default config code
  perf pmu: Const-ify file APIs
  perf pmu: Const-ify perf_pmu__config_terms
  perf pmu-events: Remember the events and metrics table
  perf pmu: Lazily compute default config

 tools/perf/arch/arm/util/cs-etm.c    | 13 ++------
 tools/perf/arch/arm/util/pmu.c       | 10 +++---
 tools/perf/arch/arm64/util/arm-spe.c | 48 +++++++++++++---------------
 tools/perf/arch/s390/util/pmu.c      |  3 +-
 tools/perf/arch/x86/util/intel-pt.c  | 27 +++++++---------
 tools/perf/arch/x86/util/pmu.c       |  6 ++--
 tools/perf/pmu-events/jevents.py     | 48 ++++++++++++++++------------
 tools/perf/util/arm-spe.h            |  4 ++-
 tools/perf/util/cs-etm.h             |  2 +-
 tools/perf/util/intel-pt.h           |  3 +-
 tools/perf/util/parse-events.c       | 12 +++----
 tools/perf/util/pmu.c                | 39 +++++++++++-----------
 tools/perf/util/pmu.h                | 18 ++++++-----
 tools/perf/util/python.c             |  2 +-
 14 files changed, 117 insertions(+), 118 deletions(-)