Message ID | 20240207223639.3139601-1-irogers@google.com (mailing list archive) |
---|---|
Series | maps memory improvements and fixes |
Hi Ian,

On Wed, Feb 7, 2024 at 2:37 PM Ian Rogers <irogers@google.com> wrote:
>
> First 6 patches from:
> https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
>
> v2. Fix NO_LIBUNWIND=1 build issue.
>
> Ian Rogers (6):
>   perf maps: Switch from rbtree to lazily sorted array for addresses
>   perf maps: Get map before returning in maps__find
>   perf maps: Get map before returning in maps__find_by_name
>   perf maps: Get map before returning in maps__find_next_entry
>   perf maps: Hide maps internals
>   perf maps: Locking tidy up of nr_maps

Now I see a perf test failure on the vmlinux test:

  $ sudo ./perf test -v vmlinux
  1: vmlinux symtab matches kallsyms :
  --- start ---
  test child forked, pid 4164115
  /proc/{kallsyms,modules} inconsistency while looking for "[__builtin__kprobes]" module!
  /proc/{kallsyms,modules} inconsistency while looking for "[__builtin__kprobes]" module!
  /proc/{kallsyms,modules} inconsistency while looking for "[__builtin__ftrace]" module!
  Looking at the vmlinux_path (8 entries long)
  Using /usr/lib/debug/boot/vmlinux-6.5.13-1rodete2-amd64 for symbols
  perf: Segmentation fault
  Obtained 16 stack frames.
  ./perf(+0x1b7dcd) [0x55c40be97dcd]
  ./perf(+0x1b7eb7) [0x55c40be97eb7]
  /lib/x86_64-linux-gnu/libc.so.6(+0x3c510) [0x7f33d7a5a510]
  ./perf(+0x1c2e9c) [0x55c40bea2e9c]
  ./perf(+0x1c43f6) [0x55c40bea43f6]
  ./perf(+0x1c4649) [0x55c40bea4649]
  ./perf(+0x1c46d3) [0x55c40bea46d3]
  ./perf(+0x1c7303) [0x55c40bea7303]
  ./perf(+0x1c70b5) [0x55c40bea70b5]
  ./perf(+0x1c73e6) [0x55c40bea73e6]
  ./perf(+0x11833e) [0x55c40bdf833e]
  ./perf(+0x118f78) [0x55c40bdf8f78]
  ./perf(+0x103d49) [0x55c40bde3d49]
  ./perf(+0x103e75) [0x55c40bde3e75]
  ./perf(+0x1044c0) [0x55c40bde44c0]
  ./perf(+0x104de0) [0x55c40bde4de0]
  test child interrupted
  ---- end ----
  vmlinux symtab matches kallsyms: FAILED!

Thanks,
Namhyung
On Thu, Feb 8, 2024 at 9:44 AM Namhyung Kim <namhyung@kernel.org> wrote:
> [...]
> Now I see a perf test failure on the vmlinux test:
>
> $ sudo ./perf test -v vmlinux
> [...]
> perf: Segmentation fault
> [...]
> vmlinux symtab matches kallsyms: FAILED!
Ah, tripped over a latent bug summarized in this part of an asan stack trace:
```
freed by thread T0 here:
    #0 0x7fa13bcd74b5 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
    #1 0x561d66377713 in __maps__insert util/maps.c:353
    #2 0x561d66377b89 in maps__insert util/maps.c:413
    #3 0x561d6652911d in dso__process_kernel_symbol util/symbol-elf.c:1460
    #4 0x561d6652aaae in dso__load_sym_internal util/symbol-elf.c:1675
    #5 0x561d6652b6dc in dso__load_sym util/symbol-elf.c:1771
    #6 0x561d66321a4e in dso__load util/symbol.c:1914
    #7 0x561d66372cd9 in map__load util/map.c:353
    #8 0x561d663730e7 in map__find_symbol_by_name_idx util/map.c:397
    #9 0x561d663731e7 in map__find_symbol_by_name util/map.c:410
    #10 0x561d66378208 in maps__find_symbol_by_name_cb util/maps.c:524
    #11 0x561d66377f49 in maps__for_each_map util/maps.c:471
    #12 0x561d663784a0 in maps__find_symbol_by_name util/maps.c:546
    #13 0x561d662093e8 in machine__find_kernel_symbol_by_name util/machine.h:243
    #14 0x561d6620abbd in test__vmlinux_matches_kallsyms tests/vmlinux-kallsyms.c:330
...
```
dso__process_kernel_symbol rewrites the kernel maps here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/symbol-elf.c#n1378
which resizes the maps_by_address array causing the maps__for_each_map
iteration in frame 11 to be iterating over a stale/freed value.

The most correct solutions would be to clone the maps_by_address array
prior to iteration, or reference count maps_by_address and its size.
Neither of these solutions particularly appeal, so just reloading the
maps_by_address and size on each iteration also fixes the problem, but
possibly causes some maps to be skipped/repeated. I think this is
acceptable correctness for the performance.

Thanks,
Ian
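To make the "reload on each iteration" option concrete, here is a minimal sketch. It is not the perf code: `struct toy_maps`, `toy_maps__insert`, `toy_maps__for_each` and `sum_and_grow_cb` are hypothetical stand-ins, and the array holds plain ints rather than map pointers. The only point shown is that the loop re-reads the array pointer and the length from the container on every pass, so a callback that triggers realloc() does not leave the loop holding a freed pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for perf's struct maps: a growable array of ints. */
struct toy_maps {
	int *by_address;   /* may be moved by realloc() on insert */
	unsigned int nr;   /* number of valid entries */
	unsigned int cap;  /* allocated capacity */
};

/* Append a value, growing the array; returns 0 on success, -1 on OOM. */
static int toy_maps__insert(struct toy_maps *maps, int value)
{
	if (maps->nr == maps->cap) {
		unsigned int new_cap = maps->cap ? maps->cap * 2 : 1;
		int *tmp = realloc(maps->by_address, new_cap * sizeof(*tmp));

		if (!tmp)
			return -1;
		maps->by_address = tmp;
		maps->cap = new_cap;
	}
	maps->by_address[maps->nr++] = value;
	return 0;
}

typedef int (*toy_cb)(struct toy_maps *maps, int value, void *data);

/*
 * Iterate without caching the array pointer or the length: both are re-read
 * from 'maps' on every pass, so a callback that inserts never leaves the
 * loop with a dangling pointer.
 */
static int toy_maps__for_each(struct toy_maps *maps, toy_cb cb, void *data)
{
	for (unsigned int i = 0; i < maps->nr; i++) {
		int ret = cb(maps, maps->by_address[i], data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: sums values and inserts one extra entry when it sees 1. */
static int sum_and_grow_cb(struct toy_maps *maps, int value, void *data)
{
	int *sum = data;

	*sum += value;
	if (value == 1)
		return toy_maps__insert(maps, 100);
	return 0;
}
```

Because this toy only appends, the entry added during the walk is simply visited later. With a sorted array, an insert can shift entries around the current index, which is where the skipped/repeated maps mentioned above would come from.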
On Fri, Feb 9, 2024 at 6:46 PM Ian Rogers <irogers@google.com> wrote:
> [...]
> The most correct solutions would be to clone the maps_by_address array
> prior to iteration, or reference count maps_by_address and its size.
> Neither of these solutions particularly appeal, so just reloading the
> maps_by_address and size on each iteration also fixes the problem, but
> possibly causes some maps to be skipped/repeated. I think this is
> acceptable correctness for the performance.

An aside, shouldn't taking a write lock to modify the maps deadlock
with holding the read lock for iteration? Well no because
perf_singlethreaded is true for the test:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/rwsem.c#n17
Another perf_singlethreaded considered evil :-) Note, just getting rid
of perf_singlethreaded means latent bugs like this will pop up and
will need resolution.

Thanks,
Ian
On Sat, Feb 10, 2024 at 10:08 AM Ian Rogers <irogers@google.com> wrote:
>
> On Fri, Feb 9, 2024 at 6:46 PM Ian Rogers <irogers@google.com> wrote:
> > [...]
> > The most correct solutions would be to clone the maps_by_address array
> > prior to iteration, or reference count maps_by_address and its size.
> > Neither of these solutions particularly appeal, so just reloading the
> > maps_by_address and size on each iteration also fixes the problem, but
> > possibly causes some maps to be skipped/repeated. I think this is
> > acceptable correctness for the performance.

Can we move map__load() out of maps__for_each_map() ?
I think the callback should just return the map and break the loop.
And it can call the map__load() out of the read lock.

> An aside, shouldn't taking a write lock to modify the maps deadlock
> with holding the read lock for iteration? Well no because
> perf_singlethreaded is true for the test:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/rwsem.c#n17
> Another perf_singlethreaded considered evil :-) Note, just getting rid
> of perf_singlethreaded means latent bugs like this will pop up and
> will need resolution.

Yeah, maybe. How about turning it on in the test code?

Thanks,
Namhyung
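A rough sketch of the shape suggested here: the callback only records the matching map and stops the walk, and the load happens after the walk has finished. Everything in it is hypothetical (`struct demo_map`, `demo_for_each`, `demo_find_cb`, `demo_map__load`, `demo_find_and_load`). It also matches by address rather than by symbol name, an assumption made so that the find step does not itself need a loaded map. The lookup discussed in the thread is by name, which is harder to split this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a map; 'loaded' mimics lazy symbol loading. */
struct demo_map {
	unsigned long start, end;
	bool loaded;
};

typedef int (*demo_cb)(struct demo_map *map, void *data);

/* Read-only walk: the callback must not add or remove maps. */
static int demo_for_each(struct demo_map *maps, size_t nr, demo_cb cb, void *data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = cb(&maps[i], data);

		if (ret)
			return ret;
	}
	return 0;
}

struct demo_find_args {
	unsigned long addr;
	struct demo_map *found;
};

/* Only records the match and stops the walk; no loading happens here. */
static int demo_find_cb(struct demo_map *map, void *data)
{
	struct demo_find_args *args = data;

	if (args->addr >= map->start && args->addr < map->end) {
		args->found = map;
		return 1;
	}
	return 0;
}

/* Stand-in for the expensive load step, run outside the walk. */
static void demo_map__load(struct demo_map *map)
{
	map->loaded = true;
}

static struct demo_map *demo_find_and_load(struct demo_map *maps, size_t nr,
					   unsigned long addr)
{
	struct demo_find_args args = { .addr = addr, .found = NULL };

	demo_for_each(maps, nr, demo_find_cb, &args);
	/* The walk is over, so loading can no longer disturb it. */
	if (args.found)
		demo_map__load(args.found);
	return args.found;
}
```

In this toy the load step does not resize the array, so the returned pointer stays valid. A version where loading can insert maps would need to hold the result by reference rather than as a pointer into the array.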
On Mon, Feb 12, 2024 at 12:10 PM Namhyung Kim <namhyung@kernel.org> wrote:
> [...]
> Can we move map__load() out of maps__for_each_map() ?
> I think the callback should just return the map and break the loop.
> And it can call the map__load() out of the read lock.

It would need a rewrite of map__find_symbol_by_name which is being
called by a callback from maps__find_symbol_by_name. Perhaps an
initial pass to ensure everything is loaded and a safe version of the
loop that copies the maps_by_address ahead of iterating over it. It'd
be of a scope that'd be worth its own patch set.

> > An aside, shouldn't taking a write lock to modify the maps deadlock
> > with holding the read lock for iteration? Well no because
> > perf_singlethreaded is true for the test:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/rwsem.c#n17
> > Another perf_singlethreaded considered evil :-) Note, just getting rid
> > of perf_singlethreaded means latent bugs like this will pop up and
> > will need resolution.
>
> Yeah, maybe. How about turning it on in the test code?

Agreed, but I think it should be a follow up.

Thanks,
Ian
On Mon, Feb 12, 2024 at 12:22 PM Ian Rogers <irogers@google.com> wrote:
> [...]
> It would need a rewrite of map__find_symbol_by_name which is being
> called by a callback from maps__find_symbol_by_name. Perhaps an
> initial pass to ensure everything is loaded and a safe version of the
> loop that copies the maps_by_address ahead of iterating over it. It'd
> be of a scope that'd be worth its own patch set.

Right, let's do it in a separate work.

> > Yeah, maybe. How about turning it on in the test code?
>
> Agreed, but I think it should be a follow up.

Sounds good.

Thanks,
Namhyung