mbox series

[RFC,0/5] mm/bpf/perf: Store build id in file object

Message ID 20230201135737.800527-1-jolsa@kernel.org (mailing list archive)
Headers show
Series mm/bpf/perf: Store build id in file object | expand

Message

Jiri Olsa Feb. 1, 2023, 1:57 p.m. UTC
hi,
we have a use cases for bpf programs to use binary file's build id.

After some attempts to add helpers/kfuncs [1] [2] Andrii had an idea [3]
to store build id directly in the file object. That would solve our use
case and might be beneficial for other profiling/tracing use cases with
bpf programs.

This RFC patchset adds new config CONFIG_FILE_BUILD_ID option, which adds
build id object pointer to the file object when enabled. The build id is
read/populated when the file is mmap-ed.

I also added bpf and perf changes that would benefit from this.

I'm not sure what's the policy on adding stuff to file object, so apologies
if that's out of line. I'm open to any feedback or suggestions if there's
better place or way to do this.

thanks,
jirka


[1] https://lore.kernel.org/bpf/20221108222027.3409437-1-jolsa@kernel.org/
[2] https://lore.kernel.org/bpf/20221128132915.141211-1-jolsa@kernel.org/
[3] https://lore.kernel.org/bpf/CAEf4BzaZCUoxN_X2ALXwQeFTCwtL17R4P_B_-hUCcidfyO2xyQ@mail.gmail.com/
---
Jiri Olsa (5):
      mm: Store build id in file object
      bpf: Use file object build id in stackmap
      perf: Use file object build id in perf_event_mmap_event
      selftests/bpf: Add file_build_id test
      selftests/bpf: Add iter_task_vma_buildid test

 fs/file_table.c                                               |  3 +++
 include/linux/buildid.h                                       | 17 +++++++++++++++++
 include/linux/fs.h                                            |  3 +++
 kernel/bpf/stackmap.c                                         |  8 ++++++++
 kernel/events/core.c                                          | 43 +++++++++++++++++++++++++++++++++++++++----
 lib/buildid.c                                                 | 44 ++++++++++++++++++++++++++++++++++++++++++++
 mm/Kconfig                                                    |  7 +++++++
 mm/mmap.c                                                     | 15 +++++++++++++++
 tools/testing/selftests/bpf/prog_tests/bpf_iter.c             | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/file_build_id.c        | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_iter_task_vma_buildid.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/file_build_id.c             | 34 ++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/trace_helpers.c                   | 35 +++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/trace_helpers.h                   |  1 +
 14 files changed, 413 insertions(+), 4 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/file_build_id.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_vma_buildid.c
 create mode 100644 tools/testing/selftests/bpf/progs/file_build_id.c

Comments

Alexei Starovoitov Feb. 2, 2023, 11:15 a.m. UTC | #1
On Wed, Feb 1, 2023 at 5:57 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> hi,
> we have a use cases for bpf programs to use binary file's build id.
>
> After some attempts to add helpers/kfuncs [1] [2] Andrii had an idea [3]
> to store build id directly in the file object. That would solve our use
> case and might be beneficial for other profiling/tracing use cases with
> bpf programs.
>
> This RFC patchset adds new config CONFIG_FILE_BUILD_ID option, which adds
> build id object pointer to the file object when enabled. The build id is
> read/populated when the file is mmap-ed.
>
> I also added bpf and perf changes that would benefit from this.
>
> I'm not sure what's the policy on adding stuff to file object, so apologies
> if that's out of line. I'm open to any feedback or suggestions if there's
> better place or way to do this.

struct file represents all files while build_id is for executables only,
and not all executables, but those currently running, so
I think it's cleaner to put it into vm_area_struct.
Jiri Olsa Feb. 2, 2023, 2:47 p.m. UTC | #2
On Thu, Feb 02, 2023 at 03:15:39AM -0800, Alexei Starovoitov wrote:
> On Wed, Feb 1, 2023 at 5:57 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > hi,
> > we have a use cases for bpf programs to use binary file's build id.
> >
> > After some attempts to add helpers/kfuncs [1] [2] Andrii had an idea [3]
> > to store build id directly in the file object. That would solve our use
> > case and might be beneficial for other profiling/tracing use cases with
> > bpf programs.
> >
> > This RFC patchset adds new config CONFIG_FILE_BUILD_ID option, which adds
> > build id object pointer to the file object when enabled. The build id is
> > read/populated when the file is mmap-ed.
> >
> > I also added bpf and perf changes that would benefit from this.
> >
> > I'm not sure what's the policy on adding stuff to file object, so apologies
> > if that's out of line. I'm open to any feedback or suggestions if there's
> > better place or way to do this.
> 
> struct file represents all files while build_id is for executables only,
> and not all executables, but those currently running, so
> I think it's cleaner to put it into vm_area_struct.

I thought file objects would be shared to some extend and we might save
some memory keeping the build id objects there, but not sure it's really
the case now.. will check, using vma might be also easier

jirka
Matthew Wilcox (Oracle) Feb. 2, 2023, 3:10 p.m. UTC | #3
On Wed, Feb 01, 2023 at 02:57:32PM +0100, Jiri Olsa wrote:
> hi,
> we have a use cases for bpf programs to use binary file's build id.

What is your use case?  Is it some hobbyist thing or is it something
that distro kernels are all going to enable?
Jiri Olsa Feb. 2, 2023, 3:33 p.m. UTC | #4
On Thu, Feb 02, 2023 at 03:10:30PM +0000, Matthew Wilcox wrote:
> On Wed, Feb 01, 2023 at 02:57:32PM +0100, Jiri Olsa wrote:
> > hi,
> > we have a use cases for bpf programs to use binary file's build id.
> 
> What is your use case?  Is it some hobbyist thing or is it something
> that distro kernels are all going to enable?
> 

our use case is for hubble/tetragon [1] and we are asked to report
buildid of executed binary.. but the monitoring process is running
in its own pod and can't access the the binaries outside of it, so
we need to be able to read it in kernel

I understand Hao Luo has also use case for that [2]

jirka


[1] https://github.com/cilium/tetragon/
[2] https://lore.kernel.org/bpf/CA+khW7gAYHmoUkq0UqTiZjdOqARLG256USj3uFwi6z_FyZf31w@mail.gmail.com/
Peter Zijlstra Feb. 3, 2023, 10:03 a.m. UTC | #5
On Thu, Feb 02, 2023 at 03:15:39AM -0800, Alexei Starovoitov wrote:
> On Wed, Feb 1, 2023 at 5:57 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > hi,
> > we have a use cases for bpf programs to use binary file's build id.
> >
> > After some attempts to add helpers/kfuncs [1] [2] Andrii had an idea [3]
> > to store build id directly in the file object. That would solve our use
> > case and might be beneficial for other profiling/tracing use cases with
> > bpf programs.
> >
> > This RFC patchset adds new config CONFIG_FILE_BUILD_ID option, which adds
> > build id object pointer to the file object when enabled. The build id is
> > read/populated when the file is mmap-ed.
> >
> > I also added bpf and perf changes that would benefit from this.
> >
> > I'm not sure what's the policy on adding stuff to file object, so apologies
> > if that's out of line. I'm open to any feedback or suggestions if there's
> > better place or way to do this.
> 
> struct file represents all files while build_id is for executables only,
> and not all executables, but those currently running, so
> I think it's cleaner to put it into vm_area_struct.

There can be many vm_area_structs per file, and like for struct file,
there's vm_area_structs for non-executable ranges too.

Given there's only one buildid per file, struct file seems most
appropriate to me.
Hao Luo Feb. 9, 2023, 7:12 a.m. UTC | #6
On Thu, Feb 2, 2023 at 7:33 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Thu, Feb 02, 2023 at 03:10:30PM +0000, Matthew Wilcox wrote:
> > On Wed, Feb 01, 2023 at 02:57:32PM +0100, Jiri Olsa wrote:
> > > hi,
> > > we have a use cases for bpf programs to use binary file's build id.
> >
> > What is your use case?  Is it some hobbyist thing or is it something
> > that distro kernels are all going to enable?
> >
>
> our use case is for hubble/tetragon [1] and we are asked to report
> buildid of executed binary.. but the monitoring process is running
> in its own pod and can't access the the binaries outside of it, so
> we need to be able to read it in kernel
>
> I understand Hao Luo has also use case for that [2]
>

Sorry for the late reply.

We use BPF to profile stacktraces and build id is more useful than
instruction addresses. However, sometimes we need to record
stacktraces from an atomic context. In that case, if the page that
contains the build id string is not in the page cache, we would fail
to get build id. Storing the build id in file object solves this
problem and helps us get build id more reliably.

> jirka
>
>
> [1] https://github.com/cilium/tetragon/
> [2] https://lore.kernel.org/bpf/CA+khW7gAYHmoUkq0UqTiZjdOqARLG256USj3uFwi6z_FyZf31w@mail.gmail.com/
Jiri Olsa Feb. 9, 2023, 2:18 p.m. UTC | #7
On Wed, Feb 01, 2023 at 02:57:32PM +0100, Jiri Olsa wrote:
> hi,
> we have a use cases for bpf programs to use binary file's build id.
> 
> After some attempts to add helpers/kfuncs [1] [2] Andrii had an idea [3]
> to store build id directly in the file object. That would solve our use
> case and might be beneficial for other profiling/tracing use cases with
> bpf programs.
> 
> This RFC patchset adds new config CONFIG_FILE_BUILD_ID option, which adds
> build id object pointer to the file object when enabled. The build id is
> read/populated when the file is mmap-ed.
> 
> I also added bpf and perf changes that would benefit from this.
> 
> I'm not sure what's the policy on adding stuff to file object, so apologies
> if that's out of line. I'm open to any feedback or suggestions if there's
> better place or way to do this.

hi,
Matthew suggested on irc to consider inode for storing build id

I tried that and it seems to have better stats wrt allocated build
id objects, because inode is being shared among file objects

I took /proc/slabinfo output after running bpf tests

- build id stored in file:

  # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
  build_id             668    775    160   25    1 : tunables    0    0    0 : slabdata     31     31      0

- build id stored in inode:

  # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
  build_id             222    225    160   25    1 : tunables    0    0    0 : slabdata      9      9      0


I'm stranger to inode/fs/mm code so I'll spend some time checking on
what I possibly broke in there before I send it, but I'd appreciate
any early feedback ;-)

the code is in here:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  inode_build_id

I'll send another version with inode if there's no objection

thanks,
jirka


> 
> thanks,
> jirka
> 
> 
> [1] https://lore.kernel.org/bpf/20221108222027.3409437-1-jolsa@kernel.org/
> [2] https://lore.kernel.org/bpf/20221128132915.141211-1-jolsa@kernel.org/
> [3] https://lore.kernel.org/bpf/CAEf4BzaZCUoxN_X2ALXwQeFTCwtL17R4P_B_-hUCcidfyO2xyQ@mail.gmail.com/
> ---
> Jiri Olsa (5):
>       mm: Store build id in file object
>       bpf: Use file object build id in stackmap
>       perf: Use file object build id in perf_event_mmap_event
>       selftests/bpf: Add file_build_id test
>       selftests/bpf: Add iter_task_vma_buildid test
> 
>  fs/file_table.c                                               |  3 +++
>  include/linux/buildid.h                                       | 17 +++++++++++++++++
>  include/linux/fs.h                                            |  3 +++
>  kernel/bpf/stackmap.c                                         |  8 ++++++++
>  kernel/events/core.c                                          | 43 +++++++++++++++++++++++++++++++++++++++----
>  lib/buildid.c                                                 | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  mm/Kconfig                                                    |  7 +++++++
>  mm/mmap.c                                                     | 15 +++++++++++++++
>  tools/testing/selftests/bpf/prog_tests/bpf_iter.c             | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/testing/selftests/bpf/prog_tests/file_build_id.c        | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/testing/selftests/bpf/progs/bpf_iter_task_vma_buildid.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/testing/selftests/bpf/progs/file_build_id.c             | 34 ++++++++++++++++++++++++++++++++++
>  tools/testing/selftests/bpf/trace_helpers.c                   | 35 +++++++++++++++++++++++++++++++++++
>  tools/testing/selftests/bpf/trace_helpers.h                   |  1 +
>  14 files changed, 413 insertions(+), 4 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/file_build_id.c
>  create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_vma_buildid.c
>  create mode 100644 tools/testing/selftests/bpf/progs/file_build_id.c
Namhyung Kim Feb. 9, 2023, 7:38 p.m. UTC | #8
Hi Jiri,

On Thu, Feb 9, 2023 at 6:25 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Wed, Feb 01, 2023 at 02:57:32PM +0100, Jiri Olsa wrote:
> > hi,
> > we have a use cases for bpf programs to use binary file's build id.
> >
> > After some attempts to add helpers/kfuncs [1] [2] Andrii had an idea [3]
> > to store build id directly in the file object. That would solve our use
> > case and might be beneficial for other profiling/tracing use cases with
> > bpf programs.
> >
> > This RFC patchset adds new config CONFIG_FILE_BUILD_ID option, which adds
> > build id object pointer to the file object when enabled. The build id is
> > read/populated when the file is mmap-ed.
> >
> > I also added bpf and perf changes that would benefit from this.

Thanks for working on this!

> >
> > I'm not sure what's the policy on adding stuff to file object, so apologies
> > if that's out of line. I'm open to any feedback or suggestions if there's
> > better place or way to do this.
>
> hi,
> Matthew suggested on irc to consider inode for storing build id

Yeah, that's my idea too.

>
> I tried that and it seems to have better stats wrt allocated build
> id objects, because inode is being shared among file objects
>
> I took /proc/slabinfo output after running bpf tests
>
> - build id stored in file:
>
>   # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
>   build_id             668    775    160   25    1 : tunables    0    0    0 : slabdata     31     31      0
>
> - build id stored in inode:
>
>   # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
>   build_id             222    225    160   25    1 : tunables    0    0    0 : slabdata      9      9      0

Cool!

>
>
> I'm stranger to inode/fs/mm code so I'll spend some time checking on
> what I possibly broke in there before I send it, but I'd appreciate
> any early feedback ;-)
>
> the code is in here:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   inode_build_id
>
> I'll send another version with inode if there's no objection

I'll take a look.

Thanks,
Namhyung


> >
> > [1] https://lore.kernel.org/bpf/20221108222027.3409437-1-jolsa@kernel.org/
> > [2] https://lore.kernel.org/bpf/20221128132915.141211-1-jolsa@kernel.org/
> > [3] https://lore.kernel.org/bpf/CAEf4BzaZCUoxN_X2ALXwQeFTCwtL17R4P_B_-hUCcidfyO2xyQ@mail.gmail.com/
> > ---
> > Jiri Olsa (5):
> >       mm: Store build id in file object
> >       bpf: Use file object build id in stackmap
> >       perf: Use file object build id in perf_event_mmap_event
> >       selftests/bpf: Add file_build_id test
> >       selftests/bpf: Add iter_task_vma_buildid test
> >
> >  fs/file_table.c                                               |  3 +++
> >  include/linux/buildid.h                                       | 17 +++++++++++++++++
> >  include/linux/fs.h                                            |  3 +++
> >  kernel/bpf/stackmap.c                                         |  8 ++++++++
> >  kernel/events/core.c                                          | 43 +++++++++++++++++++++++++++++++++++++++----
> >  lib/buildid.c                                                 | 44 ++++++++++++++++++++++++++++++++++++++++++++
> >  mm/Kconfig                                                    |  7 +++++++
> >  mm/mmap.c                                                     | 15 +++++++++++++++
> >  tools/testing/selftests/bpf/prog_tests/bpf_iter.c             | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tools/testing/selftests/bpf/prog_tests/file_build_id.c        | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tools/testing/selftests/bpf/progs/bpf_iter_task_vma_buildid.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  tools/testing/selftests/bpf/progs/file_build_id.c             | 34 ++++++++++++++++++++++++++++++++++
> >  tools/testing/selftests/bpf/trace_helpers.c                   | 35 +++++++++++++++++++++++++++++++++++
> >  tools/testing/selftests/bpf/trace_helpers.h                   |  1 +
> >  14 files changed, 413 insertions(+), 4 deletions(-)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/file_build_id.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_vma_buildid.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/file_build_id.c