[0/8] Perf stack unwinding with pointer authentication

Message ID 20220704145333.22557-1-andrew.kilroy@arm.com (mailing list archive)

Andrew Kilroy July 4, 2022, 2:53 p.m. UTC
This patch series addresses issues that perf has when attempting to show
userspace stacks in the presence of pointer authentication on arm64.

Depending on whether libunwind or libdw is used, perf displays the
userspace stack incorrectly in 'perf report --stdio'.  With libunwind,
only the leaf function is resolved; the other entries are raw addresses
with PAC bits still set.

            |
            ---0x200000004005bf
               0x200000004005bf
               my_leaf_function

With libdw, only the leaf function is shown even though there are
callers in the application.

            |
            ---my_leaf_function


perf cannot show the full stack in 'perf report --stdio' because the
unwinders are given instruction pointers that contain a pointer
authentication code (PAC).  To unwind correctly, the libraries need to
know which bits of the instruction pointer to clear.
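
As a minimal sketch of what clearing those bits means, the helper below
removes the PAC from a userspace instruction pointer given an
instruction mask.  The function name and the example mask value are
assumptions made for illustration; they are not taken from the patches
or from libunwind.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patches: clear the PAC bits
 * from a userspace instruction pointer.  insn_mask has a 1 in every bit
 * position that may hold part of the PAC, in the style of the
 * instruction mask reported through NT_ARM_PAC_MASK.
 */
static inline uint64_t strip_insn_pac(uint64_t ip, uint64_t insn_mask)
{
	/* Userspace addresses have zeros in the upper bits once the PAC is gone. */
	return ip & ~insn_mask;
}
```

With an assumed mask of 0x007f000000000000 (bits 54:48), the address
0x200000004005bf from the libunwind output above would become 0x4005bf.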

The kernel exposes the set of PAC bits via the NT_ARM_PAC_MASK regset.
It is expected that this may vary per-task in future. The kernel also
exposes which pointer authentication keys are enabled via the
NT_ARM_PAC_ENABLED_KEYS regset, and this can change dynamically. These
are per-task state which perf would need to sample.
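
For reference, a tracer could fetch the mask regset from a stopped
tracee roughly as sketched below.  This is only an illustration of the
ptrace route, not code from the series: the helper name and the local
struct are assumptions, with the struct mirroring the layout of the
arm64 UAPI struct user_pac_mask so the sketch also compiles on other
architectures.

```c
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef NT_ARM_PAC_MASK
#define NT_ARM_PAC_MASK 0x406	/* regset note type, as in <linux/elf.h> */
#endif

/* Local mirror of the arm64 UAPI struct user_pac_mask (assumption: same layout). */
struct pac_mask {
	uint64_t data_mask;
	uint64_t insn_mask;
};

/*
 * Hypothetical helper: read the PAC masks of a ptrace-stopped tracee.
 * Returns 0 on success, -1 on failure (errno is set by ptrace).
 */
static int read_pac_mask(pid_t pid, struct pac_mask *out)
{
	struct iovec iov = { .iov_base = out, .iov_len = sizeof(*out) };

	if (ptrace(PTRACE_GETREGSET, pid,
		   (void *)(uintptr_t)NT_ARM_PAC_MASK, &iov) == -1)
		return -1;

	/* The kernel updates iov_len to the number of bytes it wrote. */
	return iov.iov_len == sizeof(*out) ? 0 : -1;
}
```

The call only succeeds for a task that is currently attached and
stopped under the caller, so every sampled task would need its own
attach and request.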

It's not always feasible for perf to acquire these regsets via ptrace.
When sampling system-wide or with inherited events this may require a
large volume of ptrace requests, and by the time the perf tool processes
a sample for a task, that task might already have terminated.

Instead, these patches allow this state to be sampled into the perf
ring buffer, where the perf tool can consume it more easily.

The first patch changes the kernel to send the pointer authentication
masks to userspace perf via the perf ring buffer.  They are published in
the sample, using a new sample field PERF_SAMPLE_ARCH_1.

The subsequent patches change userspace perf to

1) request the PERF_SAMPLE_ARCH_1 sample field
2) supply the instruction mask to libunwind
3) cope with an older kernel that does not know about the
   PERF_SAMPLE_ARCH_1 sample field
4) check whether the version of libunwind can accept an instruction
   mask from perf and, if so, enable the feature.
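
Point 3 can be pictured as a probe with a fallback: ask for the extra
sample bit, and if the kernel rejects the attribute with EINVAL, retry
without it.  The sketch below shows only that pattern and is not the
code from the series.  Every name in it is hypothetical, the
perf_event_open() call is replaced by a callback, and the bit value is
passed in as a parameter because the value of PERF_SAMPLE_ARCH_1 is not
stated here.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical callback standing in for the perf_event_open() call:
 * returns a file descriptor, or -1 with errno set.
 */
typedef int (*open_event_fn)(uint64_t sample_type, void *ctx);

/*
 * Try to open an event with an optional sample bit.  If the open fails
 * with EINVAL, retry without the bit.  On success, *used_type holds the
 * sample_type that was accepted.
 */
static int open_with_optional_bit(open_event_fn open_event, void *ctx,
				  uint64_t base_type, uint64_t optional_bit,
				  uint64_t *used_type)
{
	uint64_t type = base_type | optional_bit;
	int fd = open_event(type, ctx);

	if (fd == -1 && errno == EINVAL && optional_bit) {
		/* Possibly an older kernel: drop the optional bit and retry. */
		type = base_type;
		fd = open_event(type, ctx);
	}
	if (fd != -1)
		*used_type = type;
	return fd;
}

/* Test double: an "old kernel" that only accepts the bits set in *ctx. */
static int fake_old_kernel_open(uint64_t sample_type, void *ctx)
{
	uint64_t known = *(uint64_t *)ctx;

	if (sample_type & ~known) {
		errno = EINVAL;
		return -1;
	}
	return 3;	/* arbitrary fake descriptor */
}
```

A caller would record whether the optional bit survived in used_type and
skip parsing the extra field from samples when it did not.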

These changes depend on a libunwind change that has been merged but not
yet released.

  https://github.com/libunwind/libunwind/pull/360


Andrew Kilroy (6):
  perf arm64: Send pointer auth masks to ring buffer
  perf evsel: Do not request ptrauth sample field if not supported
  perf tools: arm64: Read ptrauth data from kernel
  perf libunwind: Feature check for libunwind ptrauth callback
  perf libunwind: arm64 pointer authentication
  perf tools: Print ptrauth struct in perf report

German Gomez (2):
  perf test: Update arm64 tests to expect ptrauth masks
  perf test arm64: Test unwinding with PACs on gcc & clang compilers

 arch/arm64/include/asm/arch_sample_data.h     |  38 ++++++
 arch/arm64/kernel/Makefile                    |   2 +-
 arch/arm64/kernel/arch_sample_data.c          |  37 ++++++
 include/linux/perf_event.h                    |  24 ++++
 include/uapi/linux/perf_event.h               |   5 +-
 kernel/events/core.c                          |  35 ++++++
 tools/build/Makefile.feature                  |   2 +
 tools/build/feature/Makefile                  |   4 +
 tools/build/feature/test-all.c                |   5 +
 .../feature/test-libunwind-arm64-ptrauth.c    |  26 ++++
 tools/include/uapi/linux/perf_event.h         |   5 +-
 tools/perf/Makefile.config                    |  10 ++
 tools/perf/Makefile.perf                      |   1 +
 tools/perf/tests/Build                        |   1 +
 tools/perf/tests/arm_unwind_pac.c             | 113 ++++++++++++++++++
 tools/perf/tests/arm_unwind_pac.sh            |  57 +++++++++
 tools/perf/tests/attr/README                  |   1 +
 .../attr/test-record-graph-default-aarch64    |   3 +-
 tools/perf/tests/attr/test-record-graph-dwarf |   1 +
 .../attr/test-record-graph-dwarf-aarch64      |  13 ++
 .../tests/attr/test-record-graph-fp-aarch64   |   3 +-
 tools/perf/tests/builtin-test.c               |   1 +
 tools/perf/tests/sample-parsing.c             |   2 +-
 tools/perf/tests/tests.h                      |   1 +
 tools/perf/util/event.h                       |   8 ++
 tools/perf/util/evsel.c                       |  64 ++++++++++
 tools/perf/util/evsel.h                       |   1 +
 tools/perf/util/perf_event_attr_fprintf.c     |   2 +-
 tools/perf/util/session.c                     |  15 +++
 tools/perf/util/unwind-libunwind-local.c      |  12 ++
 30 files changed, 485 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/include/asm/arch_sample_data.h
 create mode 100644 arch/arm64/kernel/arch_sample_data.c
 create mode 100644 tools/build/feature/test-libunwind-arm64-ptrauth.c
 create mode 100644 tools/perf/tests/arm_unwind_pac.c
 create mode 100755 tools/perf/tests/arm_unwind_pac.sh
 create mode 100644 tools/perf/tests/attr/test-record-graph-dwarf-aarch64

Comments

James Clark Sept. 7, 2022, 3 p.m. UTC | #1
On 04/07/2022 15:53, Andrew Kilroy wrote:
> This patch series addresses issues that perf has when attempting to show
> userspace stacks in the presence of pointer authentication on arm64.
> [...]

For the whole set:

Reviewed-by: James Clark <james.clark@arm.com>

I checked that the new test passes on an AWS Graviton 3 instance and
with a build of mainline libunwind. I also checked that the PAC masks on
the samples look sensible.

The tests also still pass when run on N1SDP, which doesn't have pointer
authentication.
