Message ID | 20220908063754.1369709-1-namhyung@kernel.org (mailing list archive) |
---|---|
Series | perf lock contention: Improve call stack handling (v1) |

On Wed, Sep 07, 2022 at 11:37:50PM -0700, Namhyung Kim wrote:
> Hello,
>
> I found that call stack from the lock tracepoint (using bpf_get_stackid)
> can be different on each configuration.  For example it's very different
> when I run it on a VM than on a real machine.
>
> The perf lock contention relies on the stack trace to get the lock
> caller names, this kind of difference can be annoying.  Ideally we could
> skip stack trace entries for internal BPF or lock functions and get the
> correct caller, but it's not the case as of today.  Currently it's hard
> coded to control the behavior of stack traces for the lock contention
> tracepoints.
>
> To handle those differences, add two new options to control the number of
> stack entries and how many it skips.  The default value worked well on
> my VM setup, but I had to use --stack-skip=5 on real machines.
>
> You can get it from 'perf/lock-stack-v1' branch in
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

This clashed with a patch you Acked earlier, so let's see if someone has
extra review comments and a v2 becomes needed for some other reason, at
which point you can refresh it, ok?

- Arnaldo

> Thanks,
> Namhyung
>
>
> Namhyung Kim (4):
>   perf lock contention: Factor out get_symbol_name_offset()
>   perf lock contention: Show full callstack with -v option
>   perf lock contention: Allow to change stack depth and skip
>   perf lock contention: Skip stack trace from BPF
>
>  tools/perf/Documentation/perf-lock.txt        |  6 ++
>  tools/perf/builtin-lock.c                     | 89 ++++++++++++++-----
>  tools/perf/util/bpf_lock_contention.c         | 21 +++--
>  .../perf/util/bpf_skel/lock_contention.bpf.c  |  3 +-
>  tools/perf/util/lock-contention.h             |  3 +
>  5 files changed, 96 insertions(+), 26 deletions(-)
>
>
> base-commit: 6c3bd8d3e01d9014312caa52e4ef1c29d5249648
> --
> 2.37.2.789.g6183377224-goog
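
Editorial illustration of the skip/depth idea in the cover letter quoted above. This is a minimal sketch, not the code from the series: the function name `pick_caller`, the parameter names, the frame names, and the skip value of 3 are all hypothetical and chosen only to show why one fixed skip count cannot fit both a VM and a real machine. It assumes the collected stack is first capped at a maximum depth and that the leading entries are then skipped.

```python
from typing import Optional, Sequence


def pick_caller(stack: Sequence[str], stack_skip: int, max_stack: int) -> Optional[str]:
    """Return the first frame left after capping the stack and skipping entries.

    Assumption (not taken from the patches): the stack is truncated to
    max_stack entries first, then the first stack_skip entries are dropped.
    """
    window = stack[:max_stack]      # keep at most max_stack collected entries
    callers = window[stack_skip:]   # drop the leading internal entries
    return callers[0] if callers else None


# Made-up stacks: the second one has two extra internal frames at the top,
# standing in for the configuration differences described in the cover letter.
vm_stack = ["tracepoint_internal", "lock_slowpath", "lock_api",
            "caller_a", "caller_b"]
bare_stack = ["bpf_internal_0", "bpf_internal_1", "tracepoint_internal",
              "lock_slowpath", "lock_api", "caller_a", "caller_b"]

if __name__ == "__main__":
    print(pick_caller(vm_stack, stack_skip=3, max_stack=8))
    print(pick_caller(bare_stack, stack_skip=3, max_stack=8))
    print(pick_caller(bare_stack, stack_skip=5, max_stack=8))
```

With these made-up inputs, a skip of 3 reaches the caller on the shorter stack but still lands on an internal frame on the longer one, which needs a skip of 5. That mirrors the reason given above for making the value configurable instead of hard coded.
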

Hi Arnaldo,

On Thu, Sep 8, 2022 at 11:43 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> This clashed with a patch you Acked earlier, so let's see if someone has
> extra review comments and a v2 becomes needed for some other reason, at
> which point you can refresh it, ok?

Sounds good!

Thanks,
Namhyung

On Thu, Sep 08, 2022 at 04:44:15PM -0700, Namhyung Kim wrote:
> On Thu, Sep 8, 2022 at 11:43 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> > This clashed with a patch you Acked earlier, so let's see if someone has
> > extra review comments and a v2 becomes needed for some other reason, at
> > which point you can refresh it, ok?
>
> Sounds good!

Have you resubmitted this? /me goes on the backlog...

- Arnaldo

On Tue, Sep 20, 2022 at 1:22 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> On Thu, Sep 08, 2022 at 04:44:15PM -0700, Namhyung Kim wrote:
> > Sounds good!
>
> Have you resubmitted this? /me goes on the backlog...

Yep :)

https://lore.kernel.org/r/20220912055314.744552-1-namhyung@kernel.org

On Tue, Sep 20, 2022 at 02:04:47PM -0700, Namhyung Kim wrote:
> On Tue, Sep 20, 2022 at 1:22 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> > Have you resubmitted this? /me goes on the backlog...
>
> Yep :)
>
> https://lore.kernel.org/r/20220912055314.744552-1-namhyung@kernel.org

It applies now, testing :-)

- Arnaldo