Message ID | patch-v3-4.4-d212009e517-20220221T155656Z-avarab@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | test-lib: improve LSAN + ASAN stack traces | expand |
On Mon, Feb 21, 2022 at 04:58:35PM +0100, Ævar Arnfjörð Bjarmason wrote: > Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful > stack traces from LSAN. This isn't required under ASAN which will emit > traces such as this one for a leak in "t/t0006-date.sh": > > $ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd > [...] > Direct leak of 3 byte(s) in 1 object(s) allocated from: > #0 0x488b94 in strdup (t/helper/test-tool+0x488b94) > #1 0x9444a4 in xstrdup wrapper.c:29:14 > #2 0x5995fa in parse_date_format date.c:991:24 > #3 0x4d2056 in show_dates t/helper/test-date.c:39:2 > #4 0x4d174a in cmd__date t/helper/test-date.c:116:3 > #5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11 > #6 0x4cd1e3 in main common-main.c:52:11 > #7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16 > #8 0x422b09 in _start (t/helper/test-tool+0x422b09) > > SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s). > Aborted > > Whereas LSAN would emit this instead: > > $ ./t0006-date.sh -vixd > [...] > Direct leak of 3 byte(s) in 1 object(s) allocated from: > #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8) > #1 0x7f2be1d614aa in strdup string/strdup.c:42:15 > > SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s). > Aborted > > Now we'll instead git this sensible stack trace under > LSAN. I.e. almost the same one (but starting with "malloc", as is > usual for LSAN) as under ASAN: > > Direct leak of 3 byte(s) in 1 object(s) allocated from: > #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8) > #1 0x7f012af5c4aa in strdup string/strdup.c:42:15 > #2 0x5cb164 in xstrdup wrapper.c:29:14 > #3 0x495ee9 in parse_date_format date.c:991:24 > #4 0x453aac in show_dates t/helper/test-date.c:39:2 > #5 0x453782 in cmd__date t/helper/test-date.c:116:3 > #6 0x451d95 in cmd_main t/helper/test-tool.c:127:11 > #7 0x451f1e in main common-main.c:52:11 > #8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16 > #9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9) > > SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s). > Aborted This is great, by the way. I have often hit that bug in LSan and been incredibly frustrated by it. I'm happy to see it getting fixed here, thank you. > As the option name suggests this does make things slower, e.g. for > t0001-init.sh we're around 10% slower: > > $ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3 > Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh > Time (mean ± σ): 2.135 s ± 0.015 s [User: 1.951 s, System: 0.554 s] > Range (min … max): 2.122 s … 2.152 s 3 runs > > Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh > Time (mean ± σ): 1.981 s ± 0.055 s [User: 1.769 s, System: 0.488 s] > Range (min … max): 1.941 s … 2.044 s 3 runs > > Summary > 'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran > 1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh' > > I think that's more than worth it to get the more meaningful stack > traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for > one-off "fast" runs. I completely agree. I am almost always run ASan / LSan tests a single script at a time (often focusing on just one script that I know demonstrates some bug). At GitHub, we use both a sanitized and un-sanitized build when running CI. So we'll probably feel the effects a little more during the "run make test under a sanitized build" CI job, but we could easily set fast_unwind_on_malloc=0 if it becomes too big of a problem for us (though I suspect it won't matter in practice). Thanks, Taylor
On Mon, Feb 21 2022, Taylor Blau wrote: > On Mon, Feb 21, 2022 at 04:58:35PM +0100, Ævar Arnfjörð Bjarmason wrote: >> Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful >> stack traces from LSAN. This isn't required under ASAN which will emit >> traces such as this one for a leak in "t/t0006-date.sh": >> >> $ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd >> [...] >> Direct leak of 3 byte(s) in 1 object(s) allocated from: >> #0 0x488b94 in strdup (t/helper/test-tool+0x488b94) >> #1 0x9444a4 in xstrdup wrapper.c:29:14 >> #2 0x5995fa in parse_date_format date.c:991:24 >> #3 0x4d2056 in show_dates t/helper/test-date.c:39:2 >> #4 0x4d174a in cmd__date t/helper/test-date.c:116:3 >> #5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11 >> #6 0x4cd1e3 in main common-main.c:52:11 >> #7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16 >> #8 0x422b09 in _start (t/helper/test-tool+0x422b09) >> >> SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s). >> Aborted >> >> Whereas LSAN would emit this instead: >> >> $ ./t0006-date.sh -vixd >> [...] >> Direct leak of 3 byte(s) in 1 object(s) allocated from: >> #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8) >> #1 0x7f2be1d614aa in strdup string/strdup.c:42:15 >> >> SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s). >> Aborted >> >> Now we'll instead git this sensible stack trace under >> LSAN. I.e. almost the same one (but starting with "malloc", as is >> usual for LSAN) as under ASAN: >> >> Direct leak of 3 byte(s) in 1 object(s) allocated from: >> #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8) >> #1 0x7f012af5c4aa in strdup string/strdup.c:42:15 >> #2 0x5cb164 in xstrdup wrapper.c:29:14 >> #3 0x495ee9 in parse_date_format date.c:991:24 >> #4 0x453aac in show_dates t/helper/test-date.c:39:2 >> #5 0x453782 in cmd__date t/helper/test-date.c:116:3 >> #6 0x451d95 in cmd_main t/helper/test-tool.c:127:11 >> #7 0x451f1e in main common-main.c:52:11 >> #8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16 >> #9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9) >> >> SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s). >> Aborted > > This is great, by the way. I have often hit that bug in LSan and been > incredibly frustrated by it. I'm happy to see it getting fixed here, > thank you. Cheers! >> As the option name suggests this does make things slower, e.g. for >> t0001-init.sh we're around 10% slower: >> >> $ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3 >> Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh >> Time (mean ± σ): 2.135 s ± 0.015 s [User: 1.951 s, System: 0.554 s] >> Range (min … max): 2.122 s … 2.152 s 3 runs >> >> Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh >> Time (mean ± σ): 1.981 s ± 0.055 s [User: 1.769 s, System: 0.488 s] >> Range (min … max): 1.941 s … 2.044 s 3 runs >> >> Summary >> 'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran >> 1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh' >> >> I think that's more than worth it to get the more meaningful stack >> traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for >> one-off "fast" runs. > > I completely agree. I am almost always run ASan / LSan tests a single > script at a time (often focusing on just one script that I know > demonstrates some bug). > > At GitHub, we use both a sanitized and un-sanitized build when running > CI. So we'll probably feel the effects a little more during the "run > make test under a sanitized build" CI job, but we could easily set > fast_unwind_on_malloc=0 if it becomes too big of a problem for us > (though I suspect it won't matter in practice). I suspect you mean SANITIZE=address not SANITIZE=leak, i.e. ASAN not LSAN. Note that the performance of the two is often quite different (with ASAN starting out much slower). I almost never use ASAN, but I've been using LSAN a lot. I also haven't been able to have ASAN spew out these worse stack traces, as noted in the commit message. But in some brief testing now if I were setting fast_unwind_on_malloc=0 it would be around 15% slower: $ hyperfine -L v 0,1 'ASAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 5 Benchmark 1: ASAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh Time (mean ± σ): 3.264 s ± 0.014 s [User: 2.456 s, System: 0.927 s] Range (min … max): 3.249 s … 3.282 s 5 runs Benchmark 2: ASAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh Time (mean ± σ): 2.826 s ± 0.023 s [User: 2.008 s, System: 0.936 s] Range (min … max): 2.814 s … 2.866 s 5 runs Summary 'ASAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran 1.15 ± 0.01 times faster than 'ASAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh' But I don't know in the ASAN case what, if anything, you're getting for that reduction in performance.
diff --git a/t/test-lib.sh b/t/test-lib.sh index ff13321bfd3..a96d19a332e 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -71,6 +71,7 @@ prepend_var ASAN_OPTIONS : detect_leaks=0 export ASAN_OPTIONS prepend_var LSAN_OPTIONS : $GIT_SAN_OPTIONS +prepend_var LSAN_OPTIONS : fast_unwind_on_malloc=0 export LSAN_OPTIONS if test ! -f "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful stack traces from LSAN. This isn't required under ASAN which will emit traces such as this one for a leak in "t/t0006-date.sh": $ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd [...] Direct leak of 3 byte(s) in 1 object(s) allocated from: #0 0x488b94 in strdup (t/helper/test-tool+0x488b94) #1 0x9444a4 in xstrdup wrapper.c:29:14 #2 0x5995fa in parse_date_format date.c:991:24 #3 0x4d2056 in show_dates t/helper/test-date.c:39:2 #4 0x4d174a in cmd__date t/helper/test-date.c:116:3 #5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11 #6 0x4cd1e3 in main common-main.c:52:11 #7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16 #8 0x422b09 in _start (t/helper/test-tool+0x422b09) SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s). Aborted Whereas LSAN would emit this instead: $ ./t0006-date.sh -vixd [...] Direct leak of 3 byte(s) in 1 object(s) allocated from: #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8) #1 0x7f2be1d614aa in strdup string/strdup.c:42:15 SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s). Aborted Now we'll instead git this sensible stack trace under LSAN. I.e. almost the same one (but starting with "malloc", as is usual for LSAN) as under ASAN: Direct leak of 3 byte(s) in 1 object(s) allocated from: #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8) #1 0x7f012af5c4aa in strdup string/strdup.c:42:15 #2 0x5cb164 in xstrdup wrapper.c:29:14 #3 0x495ee9 in parse_date_format date.c:991:24 #4 0x453aac in show_dates t/helper/test-date.c:39:2 #5 0x453782 in cmd__date t/helper/test-date.c:116:3 #6 0x451d95 in cmd_main t/helper/test-tool.c:127:11 #7 0x451f1e in main common-main.c:52:11 #8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16 #9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9) SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s). Aborted As the option name suggests this does make things slower, e.g. for t0001-init.sh we're around 10% slower: $ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3 Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh Time (mean ± σ): 2.135 s ± 0.015 s [User: 1.951 s, System: 0.554 s] Range (min … max): 2.122 s … 2.152 s 3 runs Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh Time (mean ± σ): 1.981 s ± 0.055 s [User: 1.769 s, System: 0.488 s] Range (min … max): 1.941 s … 2.044 s 3 runs Summary 'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran 1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh' I think that's more than worth it to get the more meaningful stack traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for one-off "fast" runs. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- t/test-lib.sh | 1 + 1 file changed, 1 insertion(+)