Message ID | 20210705232524.4024832-2-edwin@etorok.net (mailing list archive) |
---|---|
State | New, archived |
Series | RISC-V: make perf record --call-graph=dwarf work |
On Mon, 05 Jul 2021 16:25:21 PDT (-0700), edwin@etorok.net wrote:
> For libdw-based callgraph sampling to work we need to sample registers.
>
> Tested on HiFive Unmatched with a trunk version of OCaml:

These generally LGTM, but I don't have patch #2. Sometimes that means
they're just not targeted at the RISC-V tree, which is fine with me but
I'm happy to take some time to look closer and take them if that's what
you're looking for.
A cover letter can be a good bet, to describe this sort of stuff.

> ```
> # perf record --user-regs=?
> available registers: pc ra sp gp tp t0 t1 t2 s0 s1 a0 a1 a2 a3 a4 a5 a6 a7
> s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 t3 t4 t5 t6
>
> # sed -e 's/(\*\*//' -e 's/\*\*)//' \
>   testsuite/tests/misc-unsafe/almabench.ml >almabench_benchmark.ml
> # ocamlopt almabench_benchmark.ml -unsafe -o almabench_benchmark
> # perf record -e cpu-clock -v --call-graph=dwarf -F 99 \
>   ./almabench_benchmark
>
> callchain: type DWARF
> callchain: stack dump size 8192
> nr_cblocks: 0
> affinity: SYS
> mmap flush: 1
> comp level: 0
> mmap size 528384B
> mmap size 528384B
> Control descriptor is not initialized
> 0 17.00 -26.06
> 1 12.34 1.29
> 2 6.83 22.95
> 3 0.04 -1.26
> 4 2.30 12.54
> 5 2.93 14.35
> 6 21.27 -16.57
> 7 20.41 -19.04
> [ perf record: Woken up 65 times to write data ]
> Looking at the vmlinux_path (8 entries long)
> Failed to open /proc/kcore.
> Note /proc/kcore requires CAP_SYS_RAWIO capability to access.
> Using /proc/kallsyms for symbols
> failed to write feature CPUDESC
> failed to write feature CPUID
> failed to write feature NUMA_TOPOLOGY
> failed to write feature MEM_TOPOLOGY
> [ perf record: Captured and wrote 16.265 MB perf.data (1997 samples) ]
>
> # perf script
> ...
> almabench_bench 1193 14816.399235: 10101010 cpu-clock:
>     3fbbd63d54 reduce_sincos+0x268 (inlined)
>     3fbbd63d54 __cos+0x268 (/lib/libm-2.33.so)
>     2aac4b261b Almabench_benchmark.planetpv+0x25f
>       (/home/root/ocaml/almabench_benchmark)
>     2aac4b3e03 Almabench_benchmark.entry+0x12bb
>       (/home/root/ocaml/almabench_benchmark)
>     2aac4b0097 caml_program+0x143
>       (/home/root/ocaml/almabench_benchmark)
>     2aac4e4e3d caml_start_program+0x71
>       (/home/root/ocaml/almabench_benchmark)
>     2aac4e5409 caml_startup_common+0x18f
>       (/home/root/ocaml/almabench_benchmark)
>     2aac4e5465 caml_startup_exn+0x9 (inlined)
>     2aac4e5465 caml_startup+0x9 (inlined)
>     2aac4e5465 caml_main+0x9
>       (/home/root/ocaml/almabench_benchmark)
>     2aac4afe89 main+0x9
>       (/home/root/ocaml/almabench_benchmark)
>     3fbbc47b0b __libc_start_main+0x85 (/lib/libc-2.33.so)
>     2aac4afebb _start+0x2b (/home/root/ocaml/almabench_benchmark)
> ```
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Walmsley <paul.walmsley@sifive.com>
> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> Cc: Albert Ou <aou@eecs.berkeley.edu>
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-riscv@lists.infradead.org
> Signed-off-by: Edwin Török <edwin@etorok.net>
> ---
>  tools/perf/arch/riscv/util/perf_regs.c | 32 ++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/tools/perf/arch/riscv/util/perf_regs.c b/tools/perf/arch/riscv/util/perf_regs.c
> index 2864e2e3776d..8c9f511e8322 100644
> --- a/tools/perf/arch/riscv/util/perf_regs.c
> +++ b/tools/perf/arch/riscv/util/perf_regs.c
> @@ -2,5 +2,37 @@
>  #include "../../util/perf_regs.h"
>
>  const struct sample_reg sample_reg_masks[] = {
> +	SMPL_REG(pc, PERF_REG_RISCV_PC),
> +	SMPL_REG(ra, PERF_REG_RISCV_RA),
> +	SMPL_REG(sp, PERF_REG_RISCV_SP),
> +	SMPL_REG(gp, PERF_REG_RISCV_GP),
> +	SMPL_REG(tp, PERF_REG_RISCV_TP),
> +	SMPL_REG(t0, PERF_REG_RISCV_T0),
> +	SMPL_REG(t1, PERF_REG_RISCV_T1),
> +	SMPL_REG(t2, PERF_REG_RISCV_T2),
> +	SMPL_REG(s0, PERF_REG_RISCV_S0),
> +	SMPL_REG(s1, PERF_REG_RISCV_S1),
> +	SMPL_REG(a0, PERF_REG_RISCV_A0),
> +	SMPL_REG(a1, PERF_REG_RISCV_A1),
> +	SMPL_REG(a2, PERF_REG_RISCV_A2),
> +	SMPL_REG(a3, PERF_REG_RISCV_A3),
> +	SMPL_REG(a4, PERF_REG_RISCV_A4),
> +	SMPL_REG(a5, PERF_REG_RISCV_A5),
> +	SMPL_REG(a6, PERF_REG_RISCV_A6),
> +	SMPL_REG(a7, PERF_REG_RISCV_A7),
> +	SMPL_REG(s2, PERF_REG_RISCV_S2),
> +	SMPL_REG(s3, PERF_REG_RISCV_S3),
> +	SMPL_REG(s4, PERF_REG_RISCV_S4),
> +	SMPL_REG(s5, PERF_REG_RISCV_S5),
> +	SMPL_REG(s6, PERF_REG_RISCV_S6),
> +	SMPL_REG(s7, PERF_REG_RISCV_S7),
> +	SMPL_REG(s8, PERF_REG_RISCV_S8),
> +	SMPL_REG(s9, PERF_REG_RISCV_S9),
> +	SMPL_REG(s10, PERF_REG_RISCV_S10),
> +	SMPL_REG(s11, PERF_REG_RISCV_S11),
> +	SMPL_REG(t3, PERF_REG_RISCV_T3),
> +	SMPL_REG(t4, PERF_REG_RISCV_T4),
> +	SMPL_REG(t5, PERF_REG_RISCV_T5),
> +	SMPL_REG(t6, PERF_REG_RISCV_T6),
>  	SMPL_REG_END
>  };
On Tue, 2021-08-03 at 21:22 -0700, Palmer Dabbelt wrote:
> On Mon, 05 Jul 2021 16:25:21 PDT (-0700), edwin@etorok.net wrote:
> > For libdw-based callgraph sampling to work we need to sample
> > registers.
> >
> > Tested on HiFive Unmatched with a trunk version of OCaml:
>
> These generally LGTM, but I don't have patch #2.

PATCH 2/4 is here:
https://lore.kernel.org/linux-riscv/20210705232524.4024832-3-edwin@etorok.net/
https://patchwork.kernel.org/project/linux-riscv/patch/20210705232524.4024832-3-edwin@etorok.net/

> Sometimes that means
> they're just not targeted at the RISC-V tree, which is fine with me but
> I'm happy to take some time to look closer and take them if that's what
> you're looking for.
> A cover letter can be a good bet, to describe this sort of stuff.

Thanks for taking a look, I'll try to ensure that RISC-V maintainers
are CC-ed on the entire series in the future (I used
scripts/get_maintainer.pl which might've been too selective).
All patches should be on the RISC-V mailing list though, cover letter
included:
https://lore.kernel.org/linux-riscv/20210705232524.4024832-1-edwin@etorok.net/T/#t
https://patchwork.kernel.org/project/linux-riscv/list/?series=511107

PATCH 2/4 is patching tools/perf/check-headers.sh, so although
technically outside of the RISC-V specific tree it actually adds one
line to include the riscv headers, so if you could take all 4 patches
into your tree that'd be great.

Thanks,
--Edwin
diff --git a/tools/perf/arch/riscv/util/perf_regs.c b/tools/perf/arch/riscv/util/perf_regs.c
index 2864e2e3776d..8c9f511e8322 100644
--- a/tools/perf/arch/riscv/util/perf_regs.c
+++ b/tools/perf/arch/riscv/util/perf_regs.c
@@ -2,5 +2,37 @@
 #include "../../util/perf_regs.h"
 
 const struct sample_reg sample_reg_masks[] = {
+	SMPL_REG(pc, PERF_REG_RISCV_PC),
+	SMPL_REG(ra, PERF_REG_RISCV_RA),
+	SMPL_REG(sp, PERF_REG_RISCV_SP),
+	SMPL_REG(gp, PERF_REG_RISCV_GP),
+	SMPL_REG(tp, PERF_REG_RISCV_TP),
+	SMPL_REG(t0, PERF_REG_RISCV_T0),
+	SMPL_REG(t1, PERF_REG_RISCV_T1),
+	SMPL_REG(t2, PERF_REG_RISCV_T2),
+	SMPL_REG(s0, PERF_REG_RISCV_S0),
+	SMPL_REG(s1, PERF_REG_RISCV_S1),
+	SMPL_REG(a0, PERF_REG_RISCV_A0),
+	SMPL_REG(a1, PERF_REG_RISCV_A1),
+	SMPL_REG(a2, PERF_REG_RISCV_A2),
+	SMPL_REG(a3, PERF_REG_RISCV_A3),
+	SMPL_REG(a4, PERF_REG_RISCV_A4),
+	SMPL_REG(a5, PERF_REG_RISCV_A5),
+	SMPL_REG(a6, PERF_REG_RISCV_A6),
+	SMPL_REG(a7, PERF_REG_RISCV_A7),
+	SMPL_REG(s2, PERF_REG_RISCV_S2),
+	SMPL_REG(s3, PERF_REG_RISCV_S3),
+	SMPL_REG(s4, PERF_REG_RISCV_S4),
+	SMPL_REG(s5, PERF_REG_RISCV_S5),
+	SMPL_REG(s6, PERF_REG_RISCV_S6),
+	SMPL_REG(s7, PERF_REG_RISCV_S7),
+	SMPL_REG(s8, PERF_REG_RISCV_S8),
+	SMPL_REG(s9, PERF_REG_RISCV_S9),
+	SMPL_REG(s10, PERF_REG_RISCV_S10),
+	SMPL_REG(s11, PERF_REG_RISCV_S11),
+	SMPL_REG(t3, PERF_REG_RISCV_T3),
+	SMPL_REG(t4, PERF_REG_RISCV_T4),
+	SMPL_REG(t5, PERF_REG_RISCV_T5),
+	SMPL_REG(t6, PERF_REG_RISCV_T6),
 	SMPL_REG_END
 };
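
Each `SMPL_REG(name, index)` entry in this table pairs a register name, as accepted by `--user-regs=`, with one bit of a 64-bit sample mask. The following is only a rough, self-contained sketch of that mapping, not perf's actual option parser. The `struct sample_reg` and `SMPL_REG` definitions are local copies written to resemble those in `tools/perf/util/perf_regs.h`, the `REG_*` enum is an assumed stand-in for the `PERF_REG_RISCV_*` indices in the order the patch lists them, and `demo_reg_masks` and `regs_mask_from_list` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed register indices, in the same order as the table in the patch
 * (stand-in for the PERF_REG_RISCV_* values). */
enum riscv_reg {
	REG_PC, REG_RA, REG_SP, REG_GP, REG_TP,
	REG_T0, REG_T1, REG_T2, REG_S0, REG_S1,
	REG_A0, REG_A1, REG_A2, REG_A3, REG_A4, REG_A5, REG_A6, REG_A7,
	REG_S2, REG_S3, REG_S4, REG_S5, REG_S6, REG_S7, REG_S8, REG_S9,
	REG_S10, REG_S11, REG_T3, REG_T4, REG_T5, REG_T6,
	REG_MAX
};

/* Local copies for illustration: a name plus a single-bit mask. */
struct sample_reg {
	const char *name;
	uint64_t mask;
};

#define SMPL_REG(n, b) { .name = #n, .mask = 1ULL << (b) }
#define SMPL_REG_END   { .name = NULL }

/* Shortened table; the patch lists all 32 registers. */
static const struct sample_reg demo_reg_masks[] = {
	SMPL_REG(pc, REG_PC),
	SMPL_REG(ra, REG_RA),
	SMPL_REG(sp, REG_SP),
	SMPL_REG(s0, REG_S0),
	SMPL_REG(t6, REG_T6),
	SMPL_REG_END
};

/* Hypothetical helper: OR together the masks of a comma-separated list of
 * register names. Returns 0 on success, -1 on an unknown or empty name. */
static int regs_mask_from_list(const char *list, uint64_t *mask)
{
	assert(list != NULL && mask != NULL);

	*mask = 0;
	while (*list) {
		size_t len = strcspn(list, ",");
		const struct sample_reg *r;

		/* Walk the table until the NULL-named terminator. */
		for (r = demo_reg_masks; r->name; r++) {
			if (strlen(r->name) == len &&
			    strncmp(list, r->name, len) == 0)
				break;
		}
		if (!r->name)
			return -1;

		*mask |= r->mask;
		list += len;
		if (*list == ',')
			list++;
	}
	return 0;
}
```

With this shortened table, `regs_mask_from_list("pc,sp", &mask)` sets `mask` to `0x5` (bits 0 and 2), and a name that is not in the table, such as `x9`, makes the helper return -1.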
For libdw-based callgraph sampling to work we need to sample registers.

Tested on HiFive Unmatched with a trunk version of OCaml:

```
# perf record --user-regs=?
available registers: pc ra sp gp tp t0 t1 t2 s0 s1 a0 a1 a2 a3 a4 a5 a6 a7 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 t3 t4 t5 t6

# sed -e 's/(\*\*//' -e 's/\*\*)//' \
  testsuite/tests/misc-unsafe/almabench.ml >almabench_benchmark.ml
# ocamlopt almabench_benchmark.ml -unsafe -o almabench_benchmark
# perf record -e cpu-clock -v --call-graph=dwarf -F 99 \
  ./almabench_benchmark

callchain: type DWARF
callchain: stack dump size 8192
nr_cblocks: 0
affinity: SYS
mmap flush: 1
comp level: 0
mmap size 528384B
mmap size 528384B
Control descriptor is not initialized
0 17.00 -26.06
1 12.34 1.29
2 6.83 22.95
3 0.04 -1.26
4 2.30 12.54
5 2.93 14.35
6 21.27 -16.57
7 20.41 -19.04
[ perf record: Woken up 65 times to write data ]
Looking at the vmlinux_path (8 entries long)
Failed to open /proc/kcore. Note /proc/kcore requires CAP_SYS_RAWIO capability to access.
Using /proc/kallsyms for symbols
failed to write feature CPUDESC
failed to write feature CPUID
failed to write feature NUMA_TOPOLOGY
failed to write feature MEM_TOPOLOGY
[ perf record: Captured and wrote 16.265 MB perf.data (1997 samples) ]

# perf script
...
almabench_bench 1193 14816.399235: 10101010 cpu-clock:
    3fbbd63d54 reduce_sincos+0x268 (inlined)
    3fbbd63d54 __cos+0x268 (/lib/libm-2.33.so)
    2aac4b261b Almabench_benchmark.planetpv+0x25f (/home/root/ocaml/almabench_benchmark)
    2aac4b3e03 Almabench_benchmark.entry+0x12bb (/home/root/ocaml/almabench_benchmark)
    2aac4b0097 caml_program+0x143 (/home/root/ocaml/almabench_benchmark)
    2aac4e4e3d caml_start_program+0x71 (/home/root/ocaml/almabench_benchmark)
    2aac4e5409 caml_startup_common+0x18f (/home/root/ocaml/almabench_benchmark)
    2aac4e5465 caml_startup_exn+0x9 (inlined)
    2aac4e5465 caml_startup+0x9 (inlined)
    2aac4e5465 caml_main+0x9 (/home/root/ocaml/almabench_benchmark)
    2aac4afe89 main+0x9 (/home/root/ocaml/almabench_benchmark)
    3fbbc47b0b __libc_start_main+0x85 (/lib/libc-2.33.so)
    2aac4afebb _start+0x2b (/home/root/ocaml/almabench_benchmark)
```

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Signed-off-by: Edwin Török <edwin@etorok.net>
---
 tools/perf/arch/riscv/util/perf_regs.c | 32 ++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
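
The commit message says that DWARF-based call graphs need sampled registers: an unwinder typically starts from the sampled `pc` and `sp` and then reads other registers as the unwind rules require. As a minimal sketch of how one value could be looked up in a sample, assuming the register dump stores one 64-bit value per set mask bit in ascending bit order (as perf's sample format is understood to do), the lookup below counts the set bits under the requested index. The names `regs_dump_demo`, `reg_value`, and the `DEMO_REG_*` indices are hypothetical and only follow the register order used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register indices (pc = 0, ra = 1, sp = 2, s0 = 8), following the
 * order of the table in the patch. */
enum { DEMO_REG_PC = 0, DEMO_REG_RA = 1, DEMO_REG_SP = 2, DEMO_REG_S0 = 8 };

/* Hypothetical, simplified stand-in for the register part of one sample. */
struct regs_dump_demo {
	uint64_t mask;         /* which registers were sampled */
	const uint64_t *regs;  /* one value per set bit, lowest bit first */
};

/* Stores the sampled value of register `id` in *val and returns 0,
 * or returns -1 if that register is not part of the sample. */
static int reg_value(const struct regs_dump_demo *d, int id, uint64_t *val)
{
	size_t idx = 0;

	assert(d != NULL && val != NULL);
	if (id < 0 || id >= 64 || !(d->mask & (1ULL << id)))
		return -1;

	/* The slot index is the number of sampled registers below `id`. */
	for (int i = 0; i < id; i++) {
		if (d->mask & (1ULL << i))
			idx++;
	}

	*val = d->regs[idx];
	return 0;
}
```

If only `pc`, `sp`, and `s0` were sampled, the dump holds three values, and `reg_value` finds `sp` in slot 1 even though its register index is 2. A request for `ra` returns -1 because its bit is not set in the mask.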