[0/4] perf lock contention: Improve call stack handling (v1)

Message ID 20220908063754.1369709-1-namhyung@kernel.org (mailing list archive)

Message

Namhyung Kim Sept. 8, 2022, 6:37 a.m. UTC
Hello,

I found that the call stack from the lock tracepoint (obtained with
bpf_get_stackid) can differ from one configuration to another.  For
example, it's very different when I run it on a VM than on a real machine.

perf lock contention relies on the stack trace to get the lock caller
names, so this kind of difference is annoying.  Ideally we could skip
stack trace entries for internal BPF or lock functions and get the
correct caller, but that's not the case as of today.  Currently the
stack trace behavior for the lock contention tracepoints is hard-coded.

To handle those differences, add two new options to control the number of
stack entries and how many of them are skipped.  The default value worked
well on my VM setup, but I had to use --stack-skip=5 on real machines.
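
For illustration only, the effect of the two options can be sketched in C
as below.  This is a minimal sketch, not code from this series:
pick_caller() and its parameter names are hypothetical, and it assumes the
trace is an array of return addresses with the innermost frame first and
unused slots zero-filled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the perf sources): pick the caller
 * address from a raw stack trace.
 *
 * stack      - return addresses, innermost frame first (assumption)
 * nr         - number of slots in stack
 * max_stack  - look at no more than this many entries
 * stack_skip - number of leading entries to ignore
 *
 * Returns the first entry after the skipped ones, or 0 if none is left.
 */
static uint64_t pick_caller(const uint64_t *stack, size_t nr,
			    size_t max_stack, size_t stack_skip)
{
	size_t depth = nr < max_stack ? nr : max_stack;
	size_t i;

	for (i = 0; i < depth; i++) {
		if (stack[i] == 0)	/* assume unused slots are zero-filled */
			break;
		if (i >= stack_skip)
			return stack[i];
	}
	return 0;
}
```

With stack_skip set to 5, the sixth entry of the trace would be reported
as the caller, provided the trace is at least that deep and max_stack
allows it.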

You can get it from 'perf/lock-stack-v1' branch in

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (4):
  perf lock contention: Factor out get_symbol_name_offset()
  perf lock contention: Show full callstack with -v option
  perf lock contention: Allow to change stack depth and skip
  perf lock contention: Skip stack trace from BPF

 tools/perf/Documentation/perf-lock.txt        |  6 ++
 tools/perf/builtin-lock.c                     | 89 ++++++++++++++-----
 tools/perf/util/bpf_lock_contention.c         | 21 +++--
 .../perf/util/bpf_skel/lock_contention.bpf.c  |  3 +-
 tools/perf/util/lock-contention.h             |  3 +
 5 files changed, 96 insertions(+), 26 deletions(-)


base-commit: 6c3bd8d3e01d9014312caa52e4ef1c29d5249648

Comments

Arnaldo Carvalho de Melo Sept. 8, 2022, 6:43 p.m. UTC | #1
Em Wed, Sep 07, 2022 at 11:37:50PM -0700, Namhyung Kim escreveu:
> Hello,
> 
> I found that call stack from the lock tracepoint (using bpf_get_stackid)
> can be different on each configuration.  For example it's very different
> when I run it on a VM than on a real machine.
> 
> The perf lock contention relies on the stack trace to get the lock
> caller names, this kind of difference can be annoying.  Ideally we could
> skip stack trace entries for internal BPF or lock functions and get the
> correct caller, but it's not the case as of today.  Currently it's hard
> coded to control the behavior of stack traces for the lock contention
> tracepoints.
> 
> To handle those differences, add two new options to control the number of
> stack entries and how many it skips.  The default value worked well on
> my VM setup, but I had to use --stack-skip=5 on real machines.
> 
> You can get it from 'perf/lock-stack-v1' branch in
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

This clashed with a patch you Acked earlier, so let's see if someone has
extra review comments and a v2 becomes needed for some other reason, at
which point you can refresh it, ok?

- Arnaldo
 
> Thanks,
> Namhyung
> 
> 
> Namhyung Kim (4):
>   perf lock contention: Factor out get_symbol_name_offset()
>   perf lock contention: Show full callstack with -v option
>   perf lock contention: Allow to change stack depth and skip
>   perf lock contention: Skip stack trace from BPF
> 
>  tools/perf/Documentation/perf-lock.txt        |  6 ++
>  tools/perf/builtin-lock.c                     | 89 ++++++++++++++-----
>  tools/perf/util/bpf_lock_contention.c         | 21 +++--
>  .../perf/util/bpf_skel/lock_contention.bpf.c  |  3 +-
>  tools/perf/util/lock-contention.h             |  3 +
>  5 files changed, 96 insertions(+), 26 deletions(-)
> 
> 
> base-commit: 6c3bd8d3e01d9014312caa52e4ef1c29d5249648
> -- 
> 2.37.2.789.g6183377224-goog

Namhyung Kim Sept. 8, 2022, 11:44 p.m. UTC | #2
Hi Arnaldo,

On Thu, Sep 8, 2022 at 11:43 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Wed, Sep 07, 2022 at 11:37:50PM -0700, Namhyung Kim escreveu:
> > Hello,
> >
> > I found that call stack from the lock tracepoint (using bpf_get_stackid)
> > can be different on each configuration.  For example it's very different
> > when I run it on a VM than on a real machine.
> >
> > The perf lock contention relies on the stack trace to get the lock
> > caller names, this kind of difference can be annoying.  Ideally we could
> > skip stack trace entries for internal BPF or lock functions and get the
> > correct caller, but it's not the case as of today.  Currently it's hard
> > coded to control the behavior of stack traces for the lock contention
> > tracepoints.
> >
> > To handle those differences, add two new options to control the number of
> > stack entries and how many it skips.  The default value worked well on
> > my VM setup, but I had to use --stack-skip=5 on real machines.
> >
> > You can get it from 'perf/lock-stack-v1' branch in
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> This clashed with a patch you Acked earlier, so lets see if someone has
> extra review comments and a v2 become needed for other reason, when you
> can refresh it, ok?

Sounds good!

Thanks,
Namhyung

Arnaldo Carvalho de Melo Sept. 20, 2022, 8:22 p.m. UTC | #3
Em Thu, Sep 08, 2022 at 04:44:15PM -0700, Namhyung Kim escreveu:
> Hi Arnaldo,
> 
> On Thu, Sep 8, 2022 at 11:43 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Wed, Sep 07, 2022 at 11:37:50PM -0700, Namhyung Kim escreveu:
> > > Hello,
> > >
> > > I found that call stack from the lock tracepoint (using bpf_get_stackid)
> > > can be different on each configuration.  For example it's very different
> > > when I run it on a VM than on a real machine.
> > >
> > > The perf lock contention relies on the stack trace to get the lock
> > > caller names, this kind of difference can be annoying.  Ideally we could
> > > skip stack trace entries for internal BPF or lock functions and get the
> > > correct caller, but it's not the case as of today.  Currently it's hard
> > > coded to control the behavior of stack traces for the lock contention
> > > tracepoints.
> > >
> > > To handle those differences, add two new options to control the number of
> > > stack entries and how many it skips.  The default value worked well on
> > > my VM setup, but I had to use --stack-skip=5 on real machines.
> > >
> > > You can get it from 'perf/lock-stack-v1' branch in
> > >
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> >
> > This clashed with a patch you Acked earlier, so lets see if someone has
> > extra review comments and a v2 become needed for other reason, when you
> > can refresh it, ok?
> 
> Sounds good!

Have you resubmitted this? /me goes on the backlog...

- Arnaldo

Namhyung Kim Sept. 20, 2022, 9:04 p.m. UTC | #4
On Tue, Sep 20, 2022 at 1:22 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Thu, Sep 08, 2022 at 04:44:15PM -0700, Namhyung Kim escreveu:
> > Hi Arnaldo,
> >
> > On Thu, Sep 8, 2022 at 11:43 AM Arnaldo Carvalho de Melo
> > <acme@kernel.org> wrote:
> > >
> > > Em Wed, Sep 07, 2022 at 11:37:50PM -0700, Namhyung Kim escreveu:
> > > > Hello,
> > > >
> > > > I found that call stack from the lock tracepoint (using bpf_get_stackid)
> > > > can be different on each configuration.  For example it's very different
> > > > when I run it on a VM than on a real machine.
> > > >
> > > > The perf lock contention relies on the stack trace to get the lock
> > > > caller names, this kind of difference can be annoying.  Ideally we could
> > > > skip stack trace entries for internal BPF or lock functions and get the
> > > > correct caller, but it's not the case as of today.  Currently it's hard
> > > > coded to control the behavior of stack traces for the lock contention
> > > > tracepoints.
> > > >
> > > > To handle those differences, add two new options to control the number of
> > > > stack entries and how many it skips.  The default value worked well on
> > > > my VM setup, but I had to use --stack-skip=5 on real machines.
> > > >
> > > > You can get it from 'perf/lock-stack-v1' branch in
> > > >
> > > >   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> > >
> > > This clashed with a patch you Acked earlier, so lets see if someone has
> > > extra review comments and a v2 become needed for other reason, when you
> > > can refresh it, ok?
> >
> > Sounds good!
>
> Have you resubmitted this? /me goes on the backlog...

Yep :)

https://lore.kernel.org/r/20220912055314.744552-1-namhyung@kernel.org

Arnaldo Carvalho de Melo Sept. 21, 2022, 2:09 p.m. UTC | #5
Em Tue, Sep 20, 2022 at 02:04:47PM -0700, Namhyung Kim escreveu:
> On Tue, Sep 20, 2022 at 1:22 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Thu, Sep 08, 2022 at 04:44:15PM -0700, Namhyung Kim escreveu:
> > > Hi Arnaldo,
> > >
> > > On Thu, Sep 8, 2022 at 11:43 AM Arnaldo Carvalho de Melo
> > > <acme@kernel.org> wrote:
> > > >
> > > > Em Wed, Sep 07, 2022 at 11:37:50PM -0700, Namhyung Kim escreveu:
> > > > > Hello,
> > > > >
> > > > > I found that call stack from the lock tracepoint (using bpf_get_stackid)
> > > > > can be different on each configuration.  For example it's very different
> > > > > when I run it on a VM than on a real machine.
> > > > >
> > > > > The perf lock contention relies on the stack trace to get the lock
> > > > > caller names, this kind of difference can be annoying.  Ideally we could
> > > > > skip stack trace entries for internal BPF or lock functions and get the
> > > > > correct caller, but it's not the case as of today.  Currently it's hard
> > > > > coded to control the behavior of stack traces for the lock contention
> > > > > tracepoints.
> > > > >
> > > > > To handle those differences, add two new options to control the number of
> > > > > stack entries and how many it skips.  The default value worked well on
> > > > > my VM setup, but I had to use --stack-skip=5 on real machines.
> > > > >
> > > > > You can get it from 'perf/lock-stack-v1' branch in
> > > > >
> > > > >   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> > > >
> > > > This clashed with a patch you Acked earlier, so lets see if someone has
> > > > extra review comments and a v2 become needed for other reason, when you
> > > > can refresh it, ok?
> > >
> > > Sounds good!
> >
> > Have you resubmitted this? /me goes on the backlog...
> 
> Yep :)
> 
> https://lore.kernel.org/r/20220912055314.744552-1-namhyung@kernel.org

It applies now, testing :-)

- Arnaldo