diff mbox series

[v3,4/4] test-lib: add "fast_unwind_on_malloc=0" to LSAN_OPTIONS

Message ID patch-v3-4.4-d212009e517-20220221T155656Z-avarab@gmail.com (mailing list archive)
State Superseded
Headers show
Series test-lib: improve LSAN + ASAN stack traces | expand

Commit Message

Ævar Arnfjörð Bjarmason Feb. 21, 2022, 3:58 p.m. UTC
Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful
stack traces from LSAN. This isn't required under ASAN which will emit
traces such as this one for a leak in "t/t0006-date.sh":

    $ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd
    [...]
    Direct leak of 3 byte(s) in 1 object(s) allocated from:
        #0 0x488b94 in strdup (t/helper/test-tool+0x488b94)
        #1 0x9444a4 in xstrdup wrapper.c:29:14
        #2 0x5995fa in parse_date_format date.c:991:24
        #3 0x4d2056 in show_dates t/helper/test-date.c:39:2
        #4 0x4d174a in cmd__date t/helper/test-date.c:116:3
        #5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11
        #6 0x4cd1e3 in main common-main.c:52:11
        #7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16
        #8 0x422b09 in _start (t/helper/test-tool+0x422b09)

    SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s).
    Aborted

Whereas LSAN would emit this instead:

    $ ./t0006-date.sh -vixd
    [...]
    Direct leak of 3 byte(s) in 1 object(s) allocated from:
        #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
        #1 0x7f2be1d614aa in strdup string/strdup.c:42:15

    SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
    Aborted

Now we'll instead git this sensible stack trace under
LSAN. I.e. almost the same one (but starting with "malloc", as is
usual for LSAN) as under ASAN:

    Direct leak of 3 byte(s) in 1 object(s) allocated from:
        #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
        #1 0x7f012af5c4aa in strdup string/strdup.c:42:15
        #2 0x5cb164 in xstrdup wrapper.c:29:14
        #3 0x495ee9 in parse_date_format date.c:991:24
        #4 0x453aac in show_dates t/helper/test-date.c:39:2
        #5 0x453782 in cmd__date t/helper/test-date.c:116:3
        #6 0x451d95 in cmd_main t/helper/test-tool.c:127:11
        #7 0x451f1e in main common-main.c:52:11
        #8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16
        #9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9)

    SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
    Aborted

As the option name suggests this does make things slower, e.g. for
t0001-init.sh we're around 10% slower:

    $ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3
    Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh
      Time (mean ± σ):      2.135 s ±  0.015 s    [User: 1.951 s, System: 0.554 s]
      Range (min … max):    2.122 s …  2.152 s    3 runs

    Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh
      Time (mean ± σ):      1.981 s ±  0.055 s    [User: 1.769 s, System: 0.488 s]
      Range (min … max):    1.941 s …  2.044 s    3 runs

    Summary
      'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran
        1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh'

I think that's more than worth it to get the more meaningful stack
traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for
one-off "fast" runs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/test-lib.sh | 1 +
 1 file changed, 1 insertion(+)

Comments

Taylor Blau Feb. 21, 2022, 5:32 p.m. UTC | #1
On Mon, Feb 21, 2022 at 04:58:35PM +0100, Ævar Arnfjörð Bjarmason wrote:
> Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful
> stack traces from LSAN. This isn't required under ASAN which will emit
> traces such as this one for a leak in "t/t0006-date.sh":
>
>     $ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd
>     [...]
>     Direct leak of 3 byte(s) in 1 object(s) allocated from:
>         #0 0x488b94 in strdup (t/helper/test-tool+0x488b94)
>         #1 0x9444a4 in xstrdup wrapper.c:29:14
>         #2 0x5995fa in parse_date_format date.c:991:24
>         #3 0x4d2056 in show_dates t/helper/test-date.c:39:2
>         #4 0x4d174a in cmd__date t/helper/test-date.c:116:3
>         #5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11
>         #6 0x4cd1e3 in main common-main.c:52:11
>         #7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16
>         #8 0x422b09 in _start (t/helper/test-tool+0x422b09)
>
>     SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s).
>     Aborted
>
> Whereas LSAN would emit this instead:
>
>     $ ./t0006-date.sh -vixd
>     [...]
>     Direct leak of 3 byte(s) in 1 object(s) allocated from:
>         #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
>         #1 0x7f2be1d614aa in strdup string/strdup.c:42:15
>
>     SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
>     Aborted
>
> Now we'll instead git this sensible stack trace under
> LSAN. I.e. almost the same one (but starting with "malloc", as is
> usual for LSAN) as under ASAN:
>
>     Direct leak of 3 byte(s) in 1 object(s) allocated from:
>         #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
>         #1 0x7f012af5c4aa in strdup string/strdup.c:42:15
>         #2 0x5cb164 in xstrdup wrapper.c:29:14
>         #3 0x495ee9 in parse_date_format date.c:991:24
>         #4 0x453aac in show_dates t/helper/test-date.c:39:2
>         #5 0x453782 in cmd__date t/helper/test-date.c:116:3
>         #6 0x451d95 in cmd_main t/helper/test-tool.c:127:11
>         #7 0x451f1e in main common-main.c:52:11
>         #8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16
>         #9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9)
>
>     SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
>     Aborted

This is great, by the way. I have often hit that bug in LSan and been
incredibly frustrated by it. I'm happy to see it getting fixed here,
thank you.

> As the option name suggests this does make things slower, e.g. for
> t0001-init.sh we're around 10% slower:
>
>     $ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3
>     Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh
>       Time (mean ± σ):      2.135 s ±  0.015 s    [User: 1.951 s, System: 0.554 s]
>       Range (min … max):    2.122 s …  2.152 s    3 runs
>
>     Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh
>       Time (mean ± σ):      1.981 s ±  0.055 s    [User: 1.769 s, System: 0.488 s]
>       Range (min … max):    1.941 s …  2.044 s    3 runs
>
>     Summary
>       'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran
>         1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh'
>
> I think that's more than worth it to get the more meaningful stack
> traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for
> one-off "fast" runs.

I completely agree. I am almost always run ASan / LSan tests a single
script at a time (often focusing on just one script that I know
demonstrates some bug).

At GitHub, we use both a sanitized and un-sanitized build when running
CI. So we'll probably feel the effects a little more during the "run
make test under a sanitized build" CI job, but we could easily set
fast_unwind_on_malloc=0 if it becomes too big of a problem for us
(though I suspect it won't matter in practice).

Thanks,
Taylor
Ævar Arnfjörð Bjarmason Feb. 21, 2022, 6:59 p.m. UTC | #2
On Mon, Feb 21 2022, Taylor Blau wrote:

> On Mon, Feb 21, 2022 at 04:58:35PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful
>> stack traces from LSAN. This isn't required under ASAN which will emit
>> traces such as this one for a leak in "t/t0006-date.sh":
>>
>>     $ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd
>>     [...]
>>     Direct leak of 3 byte(s) in 1 object(s) allocated from:
>>         #0 0x488b94 in strdup (t/helper/test-tool+0x488b94)
>>         #1 0x9444a4 in xstrdup wrapper.c:29:14
>>         #2 0x5995fa in parse_date_format date.c:991:24
>>         #3 0x4d2056 in show_dates t/helper/test-date.c:39:2
>>         #4 0x4d174a in cmd__date t/helper/test-date.c:116:3
>>         #5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11
>>         #6 0x4cd1e3 in main common-main.c:52:11
>>         #7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16
>>         #8 0x422b09 in _start (t/helper/test-tool+0x422b09)
>>
>>     SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s).
>>     Aborted
>>
>> Whereas LSAN would emit this instead:
>>
>>     $ ./t0006-date.sh -vixd
>>     [...]
>>     Direct leak of 3 byte(s) in 1 object(s) allocated from:
>>         #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
>>         #1 0x7f2be1d614aa in strdup string/strdup.c:42:15
>>
>>     SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
>>     Aborted
>>
>> Now we'll instead git this sensible stack trace under
>> LSAN. I.e. almost the same one (but starting with "malloc", as is
>> usual for LSAN) as under ASAN:
>>
>>     Direct leak of 3 byte(s) in 1 object(s) allocated from:
>>         #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
>>         #1 0x7f012af5c4aa in strdup string/strdup.c:42:15
>>         #2 0x5cb164 in xstrdup wrapper.c:29:14
>>         #3 0x495ee9 in parse_date_format date.c:991:24
>>         #4 0x453aac in show_dates t/helper/test-date.c:39:2
>>         #5 0x453782 in cmd__date t/helper/test-date.c:116:3
>>         #6 0x451d95 in cmd_main t/helper/test-tool.c:127:11
>>         #7 0x451f1e in main common-main.c:52:11
>>         #8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16
>>         #9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9)
>>
>>     SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
>>     Aborted
>
> This is great, by the way. I have often hit that bug in LSan and been
> incredibly frustrated by it. I'm happy to see it getting fixed here,
> thank you.

Cheers!

>> As the option name suggests this does make things slower, e.g. for
>> t0001-init.sh we're around 10% slower:
>>
>>     $ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3
>>     Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh
>>       Time (mean ± σ):      2.135 s ±  0.015 s    [User: 1.951 s, System: 0.554 s]
>>       Range (min … max):    2.122 s …  2.152 s    3 runs
>>
>>     Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh
>>       Time (mean ± σ):      1.981 s ±  0.055 s    [User: 1.769 s, System: 0.488 s]
>>       Range (min … max):    1.941 s …  2.044 s    3 runs
>>
>>     Summary
>>       'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran
>>         1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh'
>>
>> I think that's more than worth it to get the more meaningful stack
>> traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for
>> one-off "fast" runs.
>
> I completely agree. I am almost always run ASan / LSan tests a single
> script at a time (often focusing on just one script that I know
> demonstrates some bug).
>
> At GitHub, we use both a sanitized and un-sanitized build when running
> CI. So we'll probably feel the effects a little more during the "run
> make test under a sanitized build" CI job, but we could easily set
> fast_unwind_on_malloc=0 if it becomes too big of a problem for us
> (though I suspect it won't matter in practice).

I suspect you mean SANITIZE=address not SANITIZE=leak, i.e. ASAN not
LSAN. Note that the performance of the two is often quite different
(with ASAN starting out much slower).

I almost never use ASAN, but I've been using LSAN a lot.

I also haven't been able to have ASAN spew out these worse stack traces,
as noted in the commit message. But in some brief testing now if I were
setting fast_unwind_on_malloc=0 it would be around 15% slower:
    
    $ hyperfine -L v 0,1 'ASAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 5
    Benchmark 1: ASAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh
      Time (mean ± σ):      3.264 s ±  0.014 s    [User: 2.456 s, System: 0.927 s]
      Range (min … max):    3.249 s …  3.282 s    5 runs
     
    Benchmark 2: ASAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh
      Time (mean ± σ):      2.826 s ±  0.023 s    [User: 2.008 s, System: 0.936 s]
      Range (min … max):    2.814 s …  2.866 s    5 runs
     
    Summary
      'ASAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran
        1.15 ± 0.01 times faster than 'ASAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh'

But I don't know in the ASAN case what, if anything, you're getting for
that reduction in performance.
diff mbox series

Patch

diff --git a/t/test-lib.sh b/t/test-lib.sh
index ff13321bfd3..a96d19a332e 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -71,6 +71,7 @@  prepend_var ASAN_OPTIONS : detect_leaks=0
 export ASAN_OPTIONS
 
 prepend_var LSAN_OPTIONS : $GIT_SAN_OPTIONS
+prepend_var LSAN_OPTIONS : fast_unwind_on_malloc=0
 export LSAN_OPTIONS
 
 if test ! -f "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS