[v2,0/6] maps memory improvements and fixes

Message ID 20240207223639.3139601-1-irogers@google.com (mailing list archive)

Message

Ian Rogers Feb. 7, 2024, 10:36 p.m. UTC
First 6 patches from:
https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/

v2. Fix NO_LIBUNWIND=1 build issue.

Ian Rogers (6):
  perf maps: Switch from rbtree to lazily sorted array for addresses
  perf maps: Get map before returning in maps__find
  perf maps: Get map before returning in maps__find_by_name
  perf maps: Get map before returning in maps__find_next_entry
  perf maps: Hide maps internals
  perf maps: Locking tidy up of nr_maps

 tools/perf/arch/x86/tests/dwarf-unwind.c |    1 +
 tools/perf/tests/maps.c                  |    3 +
 tools/perf/tests/thread-maps-share.c     |    8 +-
 tools/perf/tests/vmlinux-kallsyms.c      |   10 +-
 tools/perf/util/bpf-event.c              |    1 +
 tools/perf/util/callchain.c              |    2 +-
 tools/perf/util/event.c                  |    4 +-
 tools/perf/util/machine.c                |   34 +-
 tools/perf/util/map.c                    |    1 +
 tools/perf/util/maps.c                   | 1296 ++++++++++++++--------
 tools/perf/util/maps.h                   |   65 +-
 tools/perf/util/probe-event.c            |    1 +
 tools/perf/util/symbol-elf.c             |    4 +-
 tools/perf/util/symbol.c                 |   31 +-
 tools/perf/util/thread.c                 |    2 +-
 tools/perf/util/unwind-libdw.c           |    2 +-
 tools/perf/util/unwind-libunwind-local.c |    2 +-
 tools/perf/util/unwind-libunwind.c       |    7 +-
 18 files changed, 900 insertions(+), 574 deletions(-)

Comments

Namhyung Kim Feb. 8, 2024, 5:44 p.m. UTC | #1
Hi Ian,

On Wed, Feb 7, 2024 at 2:37 PM Ian Rogers <irogers@google.com> wrote:
>
> First 6 patches from:
> https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
>
> v2. Fix NO_LIBUNWIND=1 build issue.
>
> Ian Rogers (6):
>   perf maps: Switch from rbtree to lazily sorted array for addresses
>   perf maps: Get map before returning in maps__find
>   perf maps: Get map before returning in maps__find_by_name
>   perf maps: Get map before returning in maps__find_next_entry
>   perf maps: Hide maps internals
>   perf maps: Locking tidy up of nr_maps

Now I see a perf test failure on the vmlinux test:

$ sudo ./perf test -v vmlinux
  1: vmlinux symtab matches kallsyms                                 :
--- start ---
test child forked, pid 4164115
/proc/{kallsyms,modules} inconsistency while looking for
"[__builtin__kprobes]" module!
/proc/{kallsyms,modules} inconsistency while looking for
"[__builtin__kprobes]" module!
/proc/{kallsyms,modules} inconsistency while looking for
"[__builtin__ftrace]" module!
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/boot/vmlinux-6.5.13-1rodete2-amd64 for symbols
perf: Segmentation fault
Obtained 16 stack frames.
./perf(+0x1b7dcd) [0x55c40be97dcd]
./perf(+0x1b7eb7) [0x55c40be97eb7]
/lib/x86_64-linux-gnu/libc.so.6(+0x3c510) [0x7f33d7a5a510]
./perf(+0x1c2e9c) [0x55c40bea2e9c]
./perf(+0x1c43f6) [0x55c40bea43f6]
./perf(+0x1c4649) [0x55c40bea4649]
./perf(+0x1c46d3) [0x55c40bea46d3]
./perf(+0x1c7303) [0x55c40bea7303]
./perf(+0x1c70b5) [0x55c40bea70b5]
./perf(+0x1c73e6) [0x55c40bea73e6]
./perf(+0x11833e) [0x55c40bdf833e]
./perf(+0x118f78) [0x55c40bdf8f78]
./perf(+0x103d49) [0x55c40bde3d49]
./perf(+0x103e75) [0x55c40bde3e75]
./perf(+0x1044c0) [0x55c40bde44c0]
./perf(+0x104de0) [0x55c40bde4de0]
test child interrupted
---- end ----
vmlinux symtab matches kallsyms: FAILED!


Thanks,
Namhyung

Ian Rogers Feb. 10, 2024, 2:46 a.m. UTC | #2
On Thu, Feb 8, 2024 at 9:44 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Wed, Feb 7, 2024 at 2:37 PM Ian Rogers <irogers@google.com> wrote:
> >
> > First 6 patches from:
> > https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
> >
> > v2. Fix NO_LIBUNWIND=1 build issue.
> >
> > Ian Rogers (6):
> >   perf maps: Switch from rbtree to lazily sorted array for addresses
> >   perf maps: Get map before returning in maps__find
> >   perf maps: Get map before returning in maps__find_by_name
> >   perf maps: Get map before returning in maps__find_next_entry
> >   perf maps: Hide maps internals
> >   perf maps: Locking tidy up of nr_maps
>
> Now I see a perf test failure on the vmlinux test:
>
> $ sudo ./perf test -v vmlinux
>   1: vmlinux symtab matches kallsyms                                 :
> --- start ---
> test child forked, pid 4164115
> /proc/{kallsyms,modules} inconsistency while looking for
> "[__builtin__kprobes]" module!
> /proc/{kallsyms,modules} inconsistency while looking for
> "[__builtin__kprobes]" module!
> /proc/{kallsyms,modules} inconsistency while looking for
> "[__builtin__ftrace]" module!
> Looking at the vmlinux_path (8 entries long)
> Using /usr/lib/debug/boot/vmlinux-6.5.13-1rodete2-amd64 for symbols
> perf: Segmentation fault
> Obtained 16 stack frames.
> ./perf(+0x1b7dcd) [0x55c40be97dcd]
> ./perf(+0x1b7eb7) [0x55c40be97eb7]
> /lib/x86_64-linux-gnu/libc.so.6(+0x3c510) [0x7f33d7a5a510]
> ./perf(+0x1c2e9c) [0x55c40bea2e9c]
> ./perf(+0x1c43f6) [0x55c40bea43f6]
> ./perf(+0x1c4649) [0x55c40bea4649]
> ./perf(+0x1c46d3) [0x55c40bea46d3]
> ./perf(+0x1c7303) [0x55c40bea7303]
> ./perf(+0x1c70b5) [0x55c40bea70b5]
> ./perf(+0x1c73e6) [0x55c40bea73e6]
> ./perf(+0x11833e) [0x55c40bdf833e]
> ./perf(+0x118f78) [0x55c40bdf8f78]
> ./perf(+0x103d49) [0x55c40bde3d49]
> ./perf(+0x103e75) [0x55c40bde3e75]
> ./perf(+0x1044c0) [0x55c40bde44c0]
> ./perf(+0x104de0) [0x55c40bde4de0]
> test child interrupted
> ---- end ----
> vmlinux symtab matches kallsyms: FAILED!

Ah, this tripped over a latent bug, summarized in this part of an ASan stack trace:
```
freed by thread T0 here:
   #0 0x7fa13bcd74b5 in __interceptor_realloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
   #1 0x561d66377713 in __maps__insert util/maps.c:353
   #2 0x561d66377b89 in maps__insert util/maps.c:413
   #3 0x561d6652911d in dso__process_kernel_symbol util/symbol-elf.c:1460
   #4 0x561d6652aaae in dso__load_sym_internal util/symbol-elf.c:1675
   #5 0x561d6652b6dc in dso__load_sym util/symbol-elf.c:1771
   #6 0x561d66321a4e in dso__load util/symbol.c:1914
   #7 0x561d66372cd9 in map__load util/map.c:353
   #8 0x561d663730e7 in map__find_symbol_by_name_idx util/map.c:397
   #9 0x561d663731e7 in map__find_symbol_by_name util/map.c:410
   #10 0x561d66378208 in maps__find_symbol_by_name_cb util/maps.c:524
   #11 0x561d66377f49 in maps__for_each_map util/maps.c:471
   #12 0x561d663784a0 in maps__find_symbol_by_name util/maps.c:546
   #13 0x561d662093e8 in machine__find_kernel_symbol_by_name util/machine.h:243
   #14 0x561d6620abbd in test__vmlinux_matches_kallsyms
tests/vmlinux-kallsyms.c:330
...
```
dso__process_kernel_symbol rewrites the kernel maps here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/symbol-elf.c#n1378
This resizes the maps_by_address array, so the maps__for_each_map
iteration in frame #11 ends up iterating over a stale, freed array.

The most correct solutions would be to clone the maps_by_address array
prior to iteration, or to reference count maps_by_address and its size.
Neither of these solutions particularly appeals. Simply reloading
maps_by_address and its size on each iteration also fixes the problem,
at the cost that some maps may be skipped or repeated if the array
changes during the iteration. I think this is an acceptable trade-off
of correctness for performance.
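
To illustrate the "reload on each iteration" idea, here is a minimal,
self-contained sketch. It is not the perf code: struct toy_maps and all
toy_* names are hypothetical stand-ins, and the array holds plain ints
rather than struct map pointers. The loop re-reads the array pointer and
the count on every pass, so a callback that triggers a realloc does not
leave the loop reading freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct maps: a growable array of ints. */
struct toy_maps {
	int *by_address;	/* may move whenever the array is resized */
	unsigned int nr;	/* entries in use */
	unsigned int cap;	/* entries allocated */
};

/* Append a value, doubling the allocation when it is full. */
static int toy_maps__insert(struct toy_maps *maps, int value)
{
	if (maps->nr == maps->cap) {
		unsigned int cap = maps->cap ? maps->cap * 2 : 1;
		int *tmp = realloc(maps->by_address, cap * sizeof(*tmp));

		if (!tmp)
			return -1;
		maps->by_address = tmp;	/* any cached old pointer is now stale */
		maps->cap = cap;
	}
	maps->by_address[maps->nr++] = value;
	return 0;
}

typedef int (*toy_cb)(struct toy_maps *maps, int value, void *data);

/*
 * Re-read maps->by_address and maps->nr on every pass instead of caching
 * them in locals before the loop, so a resize inside cb() is tolerated.
 */
static int toy_maps__for_each(struct toy_maps *maps, toy_cb cb, void *data)
{
	for (unsigned int i = 0; i < maps->nr; i++) {
		int ret = cb(maps, maps->by_address[i], data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Callback that grows the array once, mimicking a load that inserts maps. */
static int grow_once_cb(struct toy_maps *maps, int value, void *data)
{
	int *visited = data;

	(*visited)++;
	return value == 1 ? toy_maps__insert(maps, 100) : 0;
}

/* Returns how many entries the iteration visited. */
int toy_demo(void)
{
	struct toy_maps maps = { 0 };
	int visited = 0;

	toy_maps__insert(&maps, 1);
	toy_maps__insert(&maps, 2);
	toy_maps__for_each(&maps, grow_once_cb, &visited);
	free(maps.by_address);
	return visited;
}
```

In this toy the new entry is appended, so the loop simply visits it as
well. With a sorted array an insertion can shift entries around the
current index, which is where the skipped or repeated maps would come
from.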

Thanks,
Ian

> Thanks,
> Namhyung

Ian Rogers Feb. 10, 2024, 6:08 p.m. UTC | #3
On Fri, Feb 9, 2024 at 6:46 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Feb 8, 2024 at 9:44 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Wed, Feb 7, 2024 at 2:37 PM Ian Rogers <irogers@google.com> wrote:
> > >
> > > First 6 patches from:
> > > https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
> > >
> > > v2. Fix NO_LIBUNWIND=1 build issue.
> > >
> > > Ian Rogers (6):
> > >   perf maps: Switch from rbtree to lazily sorted array for addresses
> > >   perf maps: Get map before returning in maps__find
> > >   perf maps: Get map before returning in maps__find_by_name
> > >   perf maps: Get map before returning in maps__find_next_entry
> > >   perf maps: Hide maps internals
> > >   perf maps: Locking tidy up of nr_maps
> >
> > Now I see a perf test failure on the vmlinux test:
> >
> > $ sudo ./perf test -v vmlinux
> >   1: vmlinux symtab matches kallsyms                                 :
> > --- start ---
> > test child forked, pid 4164115
> > /proc/{kallsyms,modules} inconsistency while looking for
> > "[__builtin__kprobes]" module!
> > /proc/{kallsyms,modules} inconsistency while looking for
> > "[__builtin__kprobes]" module!
> > /proc/{kallsyms,modules} inconsistency while looking for
> > "[__builtin__ftrace]" module!
> > Looking at the vmlinux_path (8 entries long)
> > Using /usr/lib/debug/boot/vmlinux-6.5.13-1rodete2-amd64 for symbols
> > perf: Segmentation fault
> > Obtained 16 stack frames.
> > ./perf(+0x1b7dcd) [0x55c40be97dcd]
> > ./perf(+0x1b7eb7) [0x55c40be97eb7]
> > /lib/x86_64-linux-gnu/libc.so.6(+0x3c510) [0x7f33d7a5a510]
> > ./perf(+0x1c2e9c) [0x55c40bea2e9c]
> > ./perf(+0x1c43f6) [0x55c40bea43f6]
> > ./perf(+0x1c4649) [0x55c40bea4649]
> > ./perf(+0x1c46d3) [0x55c40bea46d3]
> > ./perf(+0x1c7303) [0x55c40bea7303]
> > ./perf(+0x1c70b5) [0x55c40bea70b5]
> > ./perf(+0x1c73e6) [0x55c40bea73e6]
> > ./perf(+0x11833e) [0x55c40bdf833e]
> > ./perf(+0x118f78) [0x55c40bdf8f78]
> > ./perf(+0x103d49) [0x55c40bde3d49]
> > ./perf(+0x103e75) [0x55c40bde3e75]
> > ./perf(+0x1044c0) [0x55c40bde44c0]
> > ./perf(+0x104de0) [0x55c40bde4de0]
> > test child interrupted
> > ---- end ----
> > vmlinux symtab matches kallsyms: FAILED!
>
> Ah, tripped over a latent bug summarized in this part of an asan stack trace:
> ```
> freed by thread T0 here:
>    #0 0x7fa13bcd74b5 in __interceptor_realloc
> ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
>    #1 0x561d66377713 in __maps__insert util/maps.c:353
>    #2 0x561d66377b89 in maps__insert util/maps.c:413
>    #3 0x561d6652911d in dso__process_kernel_symbol util/symbol-elf.c:1460
>    #4 0x561d6652aaae in dso__load_sym_internal util/symbol-elf.c:1675
>    #5 0x561d6652b6dc in dso__load_sym util/symbol-elf.c:1771
>    #6 0x561d66321a4e in dso__load util/symbol.c:1914
>    #7 0x561d66372cd9 in map__load util/map.c:353
>    #8 0x561d663730e7 in map__find_symbol_by_name_idx util/map.c:397
>    #9 0x561d663731e7 in map__find_symbol_by_name util/map.c:410
>    #10 0x561d66378208 in maps__find_symbol_by_name_cb util/maps.c:524
>    #11 0x561d66377f49 in maps__for_each_map util/maps.c:471
>    #12 0x561d663784a0 in maps__find_symbol_by_name util/maps.c:546
>    #13 0x561d662093e8 in machine__find_kernel_symbol_by_name util/machine.h:243
>    #14 0x561d6620abbd in test__vmlinux_matches_kallsyms
> tests/vmlinux-kallsyms.c:330
> ...
> ```
> dso__process_kernel_symbol rewrites the kernel maps here:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/symbol-elf.c#n1378
> which resizes the maps_by_address array causing the maps__for_each_map
> iteration in frame 11 to be iterating over a stale/freed value.
>
> The most correct solutions would be to clone the maps_by_address array
> prior to iteration, or reference count maps_by_address and its size.
> Neither of these solutions particularly appeal, so just reloading the
> maps_by_address and size on each iteration also fixes the problem, but
> possibly causes some maps to be skipped/repeated. I think this is
> acceptable correctness for the performance.

An aside: shouldn't taking a write lock to modify the maps deadlock
with holding the read lock for iteration? Well, no, because
perf_singlethreaded is true for the test, so the rwsem calls skip the
underlying lock:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/rwsem.c#n17
Another perf_singlethreaded considered evil :-) Note, just getting rid
of perf_singlethreaded means latent bugs like this will pop up and
will need resolution.

Thanks,
Ian

> Thanks,
> Ian
>
> > Thanks,
> > Namhyung

Namhyung Kim Feb. 12, 2024, 8:10 p.m. UTC | #4
On Sat, Feb 10, 2024 at 10:08 AM Ian Rogers <irogers@google.com> wrote:
>
> On Fri, Feb 9, 2024 at 6:46 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Thu, Feb 8, 2024 at 9:44 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > Hi Ian,
> > >
> > > On Wed, Feb 7, 2024 at 2:37 PM Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > First 6 patches from:
> > > > https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
> > > >
> > > > v2. Fix NO_LIBUNWIND=1 build issue.
> > > >
> > > > Ian Rogers (6):
> > > >   perf maps: Switch from rbtree to lazily sorted array for addresses
> > > >   perf maps: Get map before returning in maps__find
> > > >   perf maps: Get map before returning in maps__find_by_name
> > > >   perf maps: Get map before returning in maps__find_next_entry
> > > >   perf maps: Hide maps internals
> > > >   perf maps: Locking tidy up of nr_maps
> > >
> > > Now I see a perf test failure on the vmlinux test:
> > >
> > > $ sudo ./perf test -v vmlinux
> > >   1: vmlinux symtab matches kallsyms                                 :
> > > --- start ---
> > > test child forked, pid 4164115
> > > /proc/{kallsyms,modules} inconsistency while looking for
> > > "[__builtin__kprobes]" module!
> > > /proc/{kallsyms,modules} inconsistency while looking for
> > > "[__builtin__kprobes]" module!
> > > /proc/{kallsyms,modules} inconsistency while looking for
> > > "[__builtin__ftrace]" module!
> > > Looking at the vmlinux_path (8 entries long)
> > > Using /usr/lib/debug/boot/vmlinux-6.5.13-1rodete2-amd64 for symbols
> > > perf: Segmentation fault
> > > Obtained 16 stack frames.
> > > ./perf(+0x1b7dcd) [0x55c40be97dcd]
> > > ./perf(+0x1b7eb7) [0x55c40be97eb7]
> > > /lib/x86_64-linux-gnu/libc.so.6(+0x3c510) [0x7f33d7a5a510]
> > > ./perf(+0x1c2e9c) [0x55c40bea2e9c]
> > > ./perf(+0x1c43f6) [0x55c40bea43f6]
> > > ./perf(+0x1c4649) [0x55c40bea4649]
> > > ./perf(+0x1c46d3) [0x55c40bea46d3]
> > > ./perf(+0x1c7303) [0x55c40bea7303]
> > > ./perf(+0x1c70b5) [0x55c40bea70b5]
> > > ./perf(+0x1c73e6) [0x55c40bea73e6]
> > > ./perf(+0x11833e) [0x55c40bdf833e]
> > > ./perf(+0x118f78) [0x55c40bdf8f78]
> > > ./perf(+0x103d49) [0x55c40bde3d49]
> > > ./perf(+0x103e75) [0x55c40bde3e75]
> > > ./perf(+0x1044c0) [0x55c40bde44c0]
> > > ./perf(+0x104de0) [0x55c40bde4de0]
> > > test child interrupted
> > > ---- end ----
> > > vmlinux symtab matches kallsyms: FAILED!
> >
> > Ah, tripped over a latent bug summarized in this part of an asan stack trace:
> > ```
> > freed by thread T0 here:
> >    #0 0x7fa13bcd74b5 in __interceptor_realloc
> > ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
> >    #1 0x561d66377713 in __maps__insert util/maps.c:353
> >    #2 0x561d66377b89 in maps__insert util/maps.c:413
> >    #3 0x561d6652911d in dso__process_kernel_symbol util/symbol-elf.c:1460
> >    #4 0x561d6652aaae in dso__load_sym_internal util/symbol-elf.c:1675
> >    #5 0x561d6652b6dc in dso__load_sym util/symbol-elf.c:1771
> >    #6 0x561d66321a4e in dso__load util/symbol.c:1914
> >    #7 0x561d66372cd9 in map__load util/map.c:353
> >    #8 0x561d663730e7 in map__find_symbol_by_name_idx util/map.c:397
> >    #9 0x561d663731e7 in map__find_symbol_by_name util/map.c:410
> >    #10 0x561d66378208 in maps__find_symbol_by_name_cb util/maps.c:524
> >    #11 0x561d66377f49 in maps__for_each_map util/maps.c:471
> >    #12 0x561d663784a0 in maps__find_symbol_by_name util/maps.c:546
> >    #13 0x561d662093e8 in machine__find_kernel_symbol_by_name util/machine.h:243
> >    #14 0x561d6620abbd in test__vmlinux_matches_kallsyms
> > tests/vmlinux-kallsyms.c:330
> > ...
> > ```
> > dso__process_kernel_symbol rewrites the kernel maps here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/symbol-elf.c#n1378
> > which resizes the maps_by_address array causing the maps__for_each_map
> > iteration in frame 11 to be iterating over a stale/freed value.
> >
> > The most correct solutions would be to clone the maps_by_address array
> > prior to iteration, or reference count maps_by_address and its size.
> > Neither of these solutions particularly appeal, so just reloading the
> > maps_by_address and size on each iteration also fixes the problem, but
> > possibly causes some maps to be skipped/repeated. I think this is
> > acceptable correctness for the performance.

Can we move map__load() out of maps__for_each_map()?
I think the callback should just return the map and break the loop.
The caller can then call map__load() outside the read lock.
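
A minimal sketch of that pattern follows, under stated assumptions:
the toy_* types and functions are hypothetical, the read lock is
modelled as a simple reader count, and the map is selected by a plain
id rather than by a symbol lookup (which is the harder case in perf).
The search only picks a map and takes a reference under the lock, and
the load runs after the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_map {
	int id;
	int refcnt;
	bool loaded;
};

struct toy_maps {
	struct toy_map *entries[8];
	unsigned int nr;
	int readers;	/* stand-in for the read lock: > 0 while iterating */
};

/*
 * Loading may modify the collection, so in this toy it refuses to run
 * while an iteration holds the (modelled) read lock.
 */
static int toy_map__load(struct toy_maps *maps, struct toy_map *map)
{
	if (maps->readers > 0)
		return -1;
	map->loaded = true;
	return 0;
}

/* Select a map under the read lock, take a reference, and return it. */
static struct toy_map *toy_maps__find_by_id(struct toy_maps *maps, int id)
{
	struct toy_map *found = NULL;

	maps->readers++;			/* models down_read() */
	for (unsigned int i = 0; i < maps->nr; i++) {
		if (maps->entries[i]->id == id) {
			found = maps->entries[i];
			found->refcnt++;	/* keep it alive after unlock */
			break;
		}
	}
	maps->readers--;			/* models up_read() */
	return found;
}

/* Returns 0 if the map was found and then loaded outside the lock. */
int toy_find_then_load(void)
{
	struct toy_map a = { .id = 1, .refcnt = 1 };
	struct toy_map b = { .id = 2, .refcnt = 1 };
	struct toy_maps maps = { .entries = { &a, &b }, .nr = 2 };
	struct toy_map *map = toy_maps__find_by_id(&maps, 2);
	int ret;

	if (!map)
		return -1;
	ret = toy_map__load(&maps, map);	/* read lock already dropped */
	map->refcnt--;				/* drop the reference */
	return ret ? ret : (b.loaded ? 0 : -1);
}
```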

>
> An aside, shouldn't taking a write lock to modify the maps deadlock
> with holding the read lock for iteration? Well no because
> perf_singlethreaded is true for the test:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/rwsem.c#n17
> Another perf_singlethreaded considered evil :-) Note, just getting rid
> of perf_singlethreaded means latent bugs like this will pop up and
> will need resolution.

Yeah, maybe.  How about turning it on in the test code?

Thanks,
Namhyung

Ian Rogers Feb. 12, 2024, 8:22 p.m. UTC | #5
On Mon, Feb 12, 2024 at 12:10 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Sat, Feb 10, 2024 at 10:08 AM Ian Rogers <irogers@google.com> wrote:
> >
> > On Fri, Feb 9, 2024 at 6:46 PM Ian Rogers <irogers@google.com> wrote:
> > >
> > > On Thu, Feb 8, 2024 at 9:44 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > Hi Ian,
> > > >
> > > > On Wed, Feb 7, 2024 at 2:37 PM Ian Rogers <irogers@google.com> wrote:
> > > > >
> > > > > First 6 patches from:
> > > > > https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
> > > > >
> > > > > v2. Fix NO_LIBUNWIND=1 build issue.
> > > > >
> > > > > Ian Rogers (6):
> > > > >   perf maps: Switch from rbtree to lazily sorted array for addresses
> > > > >   perf maps: Get map before returning in maps__find
> > > > >   perf maps: Get map before returning in maps__find_by_name
> > > > >   perf maps: Get map before returning in maps__find_next_entry
> > > > >   perf maps: Hide maps internals
> > > > >   perf maps: Locking tidy up of nr_maps
> > > >
> > > > Now I see a perf test failure on the vmlinux test:
> > > >
> > > > $ sudo ./perf test -v vmlinux
> > > >   1: vmlinux symtab matches kallsyms                                 :
> > > > --- start ---
> > > > test child forked, pid 4164115
> > > > /proc/{kallsyms,modules} inconsistency while looking for
> > > > "[__builtin__kprobes]" module!
> > > > /proc/{kallsyms,modules} inconsistency while looking for
> > > > "[__builtin__kprobes]" module!
> > > > /proc/{kallsyms,modules} inconsistency while looking for
> > > > "[__builtin__ftrace]" module!
> > > > Looking at the vmlinux_path (8 entries long)
> > > > Using /usr/lib/debug/boot/vmlinux-6.5.13-1rodete2-amd64 for symbols
> > > > perf: Segmentation fault
> > > > Obtained 16 stack frames.
> > > > ./perf(+0x1b7dcd) [0x55c40be97dcd]
> > > > ./perf(+0x1b7eb7) [0x55c40be97eb7]
> > > > /lib/x86_64-linux-gnu/libc.so.6(+0x3c510) [0x7f33d7a5a510]
> > > > ./perf(+0x1c2e9c) [0x55c40bea2e9c]
> > > > ./perf(+0x1c43f6) [0x55c40bea43f6]
> > > > ./perf(+0x1c4649) [0x55c40bea4649]
> > > > ./perf(+0x1c46d3) [0x55c40bea46d3]
> > > > ./perf(+0x1c7303) [0x55c40bea7303]
> > > > ./perf(+0x1c70b5) [0x55c40bea70b5]
> > > > ./perf(+0x1c73e6) [0x55c40bea73e6]
> > > > ./perf(+0x11833e) [0x55c40bdf833e]
> > > > ./perf(+0x118f78) [0x55c40bdf8f78]
> > > > ./perf(+0x103d49) [0x55c40bde3d49]
> > > > ./perf(+0x103e75) [0x55c40bde3e75]
> > > > ./perf(+0x1044c0) [0x55c40bde44c0]
> > > > ./perf(+0x104de0) [0x55c40bde4de0]
> > > > test child interrupted
> > > > ---- end ----
> > > > vmlinux symtab matches kallsyms: FAILED!
> > >
> > > Ah, tripped over a latent bug summarized in this part of an asan stack trace:
> > > ```
> > > freed by thread T0 here:
> > >    #0 0x7fa13bcd74b5 in __interceptor_realloc
> > > ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
> > >    #1 0x561d66377713 in __maps__insert util/maps.c:353
> > >    #2 0x561d66377b89 in maps__insert util/maps.c:413
> > >    #3 0x561d6652911d in dso__process_kernel_symbol util/symbol-elf.c:1460
> > >    #4 0x561d6652aaae in dso__load_sym_internal util/symbol-elf.c:1675
> > >    #5 0x561d6652b6dc in dso__load_sym util/symbol-elf.c:1771
> > >    #6 0x561d66321a4e in dso__load util/symbol.c:1914
> > >    #7 0x561d66372cd9 in map__load util/map.c:353
> > >    #8 0x561d663730e7 in map__find_symbol_by_name_idx util/map.c:397
> > >    #9 0x561d663731e7 in map__find_symbol_by_name util/map.c:410
> > >    #10 0x561d66378208 in maps__find_symbol_by_name_cb util/maps.c:524
> > >    #11 0x561d66377f49 in maps__for_each_map util/maps.c:471
> > >    #12 0x561d663784a0 in maps__find_symbol_by_name util/maps.c:546
> > >    #13 0x561d662093e8 in machine__find_kernel_symbol_by_name util/machine.h:243
> > >    #14 0x561d6620abbd in test__vmlinux_matches_kallsyms
> > > tests/vmlinux-kallsyms.c:330
> > > ...
> > > ```
> > > dso__process_kernel_symbol rewrites the kernel maps here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/symbol-elf.c#n1378
> > > which resizes the maps_by_address array causing the maps__for_each_map
> > > iteration in frame 11 to be iterating over a stale/freed value.
> > >
> > > The most correct solutions would be to clone the maps_by_address array
> > > prior to iteration, or reference count maps_by_address and its size.
> > > Neither of these solutions particularly appeal, so just reloading the
> > > maps_by_address and size on each iteration also fixes the problem, but
> > > possibly causes some maps to be skipped/repeated. I think this is
> > > acceptable correctness for the performance.
>
> Can we move map__load() out of maps__for_each_map() ?
> I think the callback should just return the map and break the loop.
> And it can call the map__load() out of the read lock.

It would need a rewrite of map__find_symbol_by_name, which is
called from a callback in maps__find_symbol_by_name. Perhaps an
initial pass to ensure everything is loaded, plus a safe version of the
loop that copies maps_by_address ahead of iterating over it. It'd be of a
scope that'd be worth its own patch set.
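
For reference, a "safe" loop of that kind could look roughly like the
sketch below. It is only an illustration with hypothetical toy_* names:
the loop snapshots the pointer array and takes a reference on each entry
first, then iterates over the private copy, so the callback is free to
insert into (and resize) the original array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_map {
	int id;
	int refcnt;
};

struct toy_maps {
	struct toy_map **by_address;	/* may move on every insert */
	unsigned int nr;
};

/* Append a map, taking a reference; realloc may move the array. */
static int toy_maps__insert(struct toy_maps *maps, struct toy_map *map)
{
	struct toy_map **tmp = realloc(maps->by_address,
				       (maps->nr + 1) * sizeof(*tmp));

	if (!tmp)
		return -1;
	maps->by_address = tmp;
	map->refcnt++;
	maps->by_address[maps->nr++] = map;
	return 0;
}

typedef int (*toy_cb)(struct toy_maps *maps, struct toy_map *map, void *data);

/* Iterate over a private snapshot so cb() may modify the original array. */
static int toy_maps__for_each_copy(struct toy_maps *maps, toy_cb cb, void *data)
{
	unsigned int i, nr = maps->nr;
	struct toy_map **copy;
	int ret = 0;

	if (!nr)
		return 0;
	copy = malloc(nr * sizeof(*copy));
	if (!copy)
		return -1;
	/* In real code this snapshot would be taken under the read lock. */
	memcpy(copy, maps->by_address, nr * sizeof(*copy));
	for (i = 0; i < nr; i++)
		copy[i]->refcnt++;

	for (i = 0; i < nr; i++) {
		if (!ret)
			ret = cb(maps, copy[i], data);
		copy[i]->refcnt--;	/* always drop the snapshot reference */
	}
	free(copy);
	return ret;
}

static struct toy_map toy_extra = { .id = 100 };

/* Callback that inserts a new map while the iteration is in progress. */
static int insert_cb(struct toy_maps *maps, struct toy_map *map, void *data)
{
	int *visited = data;

	(*visited)++;
	return map->id == 1 ? toy_maps__insert(maps, &toy_extra) : 0;
}

/* Returns 1 if the snapshot had 2 entries and the array now has 3. */
int toy_snapshot_demo(void)
{
	struct toy_map a = { .id = 1 }, b = { .id = 2 };
	struct toy_maps maps = { 0 };
	int visited = 0, ok;

	toy_maps__insert(&maps, &a);
	toy_maps__insert(&maps, &b);
	toy_maps__for_each_copy(&maps, insert_cb, &visited);
	ok = visited == 2 && maps.nr == 3;
	free(maps.by_address);
	return ok;
}
```

The cost is an allocation and a reference per entry on every iteration,
which is presumably part of why this does not appeal as the default.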

> >
> > An aside, shouldn't taking a write lock to modify the maps deadlock
> > with holding the read lock for iteration? Well no because
> > perf_singlethreaded is true for the test:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/rwsem.c#n17
> > Another perf_singlethreaded considered evil :-) Note, just getting rid
> > of perf_singlethreaded means latent bugs like this will pop up and
> > will need resolution.
>
> Yeah, maybe.  How about turning it on in the test code?

Agreed, but I think it should be a follow up.

Thanks,
Ian

> Thanks,
> Namhyung

Namhyung Kim Feb. 13, 2024, 5:53 p.m. UTC | #6
On Mon, Feb 12, 2024 at 12:22 PM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Feb 12, 2024 at 12:10 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Sat, Feb 10, 2024 at 10:08 AM Ian Rogers <irogers@google.com> wrote:
> > >
> > > On Fri, Feb 9, 2024 at 6:46 PM Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > On Thu, Feb 8, 2024 at 9:44 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > >
> > > > > Hi Ian,
> > > > >
> > > > > On Wed, Feb 7, 2024 at 2:37 PM Ian Rogers <irogers@google.com> wrote:
> > > > > >
> > > > > > First 6 patches from:
> > > > > > https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
> > > > > >
> > > > > > v2. Fix NO_LIBUNWIND=1 build issue.
> > > > > >
> > > > > > Ian Rogers (6):
> > > > > >   perf maps: Switch from rbtree to lazily sorted array for addresses
> > > > > >   perf maps: Get map before returning in maps__find
> > > > > >   perf maps: Get map before returning in maps__find_by_name
> > > > > >   perf maps: Get map before returning in maps__find_next_entry
> > > > > >   perf maps: Hide maps internals
> > > > > >   perf maps: Locking tidy up of nr_maps
> > > > >
> > > > > Now I see a perf test failure on the vmlinux test:
> > > > >
> > > > > $ sudo ./perf test -v vmlinux
> > > > >   1: vmlinux symtab matches kallsyms                                 :
> > > > > --- start ---
> > > > > test child forked, pid 4164115
> > > > > /proc/{kallsyms,modules} inconsistency while looking for
> > > > > "[__builtin__kprobes]" module!
> > > > > /proc/{kallsyms,modules} inconsistency while looking for
> > > > > "[__builtin__kprobes]" module!
> > > > > /proc/{kallsyms,modules} inconsistency while looking for
> > > > > "[__builtin__ftrace]" module!
> > > > > Looking at the vmlinux_path (8 entries long)
> > > > > Using /usr/lib/debug/boot/vmlinux-6.5.13-1rodete2-amd64 for symbols
> > > > > perf: Segmentation fault
> > > > > Obtained 16 stack frames.
> > > > > ./perf(+0x1b7dcd) [0x55c40be97dcd]
> > > > > ./perf(+0x1b7eb7) [0x55c40be97eb7]
> > > > > /lib/x86_64-linux-gnu/libc.so.6(+0x3c510) [0x7f33d7a5a510]
> > > > > ./perf(+0x1c2e9c) [0x55c40bea2e9c]
> > > > > ./perf(+0x1c43f6) [0x55c40bea43f6]
> > > > > ./perf(+0x1c4649) [0x55c40bea4649]
> > > > > ./perf(+0x1c46d3) [0x55c40bea46d3]
> > > > > ./perf(+0x1c7303) [0x55c40bea7303]
> > > > > ./perf(+0x1c70b5) [0x55c40bea70b5]
> > > > > ./perf(+0x1c73e6) [0x55c40bea73e6]
> > > > > ./perf(+0x11833e) [0x55c40bdf833e]
> > > > > ./perf(+0x118f78) [0x55c40bdf8f78]
> > > > > ./perf(+0x103d49) [0x55c40bde3d49]
> > > > > ./perf(+0x103e75) [0x55c40bde3e75]
> > > > > ./perf(+0x1044c0) [0x55c40bde44c0]
> > > > > ./perf(+0x104de0) [0x55c40bde4de0]
> > > > > test child interrupted
> > > > > ---- end ----
> > > > > vmlinux symtab matches kallsyms: FAILED!
> > > >
> > > > Ah, tripped over a latent bug summarized in this part of an asan stack trace:
> > > > ```
> > > > freed by thread T0 here:
> > > >    #0 0x7fa13bcd74b5 in __interceptor_realloc
> > > > ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
> > > >    #1 0x561d66377713 in __maps__insert util/maps.c:353
> > > >    #2 0x561d66377b89 in maps__insert util/maps.c:413
> > > >    #3 0x561d6652911d in dso__process_kernel_symbol util/symbol-elf.c:1460
> > > >    #4 0x561d6652aaae in dso__load_sym_internal util/symbol-elf.c:1675
> > > >    #5 0x561d6652b6dc in dso__load_sym util/symbol-elf.c:1771
> > > >    #6 0x561d66321a4e in dso__load util/symbol.c:1914
> > > >    #7 0x561d66372cd9 in map__load util/map.c:353
> > > >    #8 0x561d663730e7 in map__find_symbol_by_name_idx util/map.c:397
> > > >    #9 0x561d663731e7 in map__find_symbol_by_name util/map.c:410
> > > >    #10 0x561d66378208 in maps__find_symbol_by_name_cb util/maps.c:524
> > > >    #11 0x561d66377f49 in maps__for_each_map util/maps.c:471
> > > >    #12 0x561d663784a0 in maps__find_symbol_by_name util/maps.c:546
> > > >    #13 0x561d662093e8 in machine__find_kernel_symbol_by_name util/machine.h:243
> > > >    #14 0x561d6620abbd in test__vmlinux_matches_kallsyms
> > > > tests/vmlinux-kallsyms.c:330
> > > > ...
> > > > ```
> > > > dso__process_kernel_symbol rewrites the kernel maps here:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/symbol-elf.c#n1378
> > > > which resizes the maps_by_address array causing the maps__for_each_map
> > > > iteration in frame 11 to be iterating over a stale/freed value.
> > > >
> > > > The most correct solutions would be to clone the maps_by_address array
> > > > prior to iteration, or reference count maps_by_address and its size.
> > > > Neither of these solutions particularly appeal, so just reloading the
> > > > maps_by_address and size on each iteration also fixes the problem, but
> > > > possibly causes some maps to be skipped/repeated. I think this is
> > > > acceptable correctness for the performance.
> >
> > Can we move map__load() out of maps__for_each_map() ?
> > I think the callback should just return the map and break the loop.
> > And it can call the map__load() out of the read lock.
>
> It would need a rewrite of map__find_symbol_by_name which is being
> called by a callback from maps__find_symbol_by_name. Perhaps an
> initial pass to ensure everything is loaded and a safe version of the
> loop that copies the maps_by_address ahead of copying it. It'd be of a
> scope that'd be worth its own patch set.

Right, let's do that as separate work.

>
> > >
> > > An aside, shouldn't taking a write lock to modify the maps deadlock
> > > with holding the read lock for iteration? Well no because
> > > perf_singlethreaded is true for the test:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/rwsem.c#n17
> > > Another perf_singlethreaded considered evil :-) Note, just getting rid
> > > of perf_singlethreaded means latent bugs like this will pop up and
> > > will need resolution.
> >
> > Yeah, maybe.  How about turning it on in the test code?
>
> Agreed, but I think it should be a follow up.

Sounds good.

Thanks,
Namhyung