diff mbox series

[6/6] test-lib: ignore leaks in the sanitizer's thread code

Message ID 20250101202124.GF3305462@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit b119a687d411864433aed92017c144d311b53a4c
Headers show
Series [1/6] test-lib: use individual lsan dir for --stress runs | expand

Commit Message

Jeff King Jan. 1, 2025, 8:21 p.m. UTC
Our CI jobs sometimes see false positive leaks like this:

        =================================================================
        ==3904583==ERROR: LeakSanitizer: detected memory leaks

        Direct leak of 32 byte(s) in 1 object(s) allocated from:
            #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
            #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
            #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
            #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
            #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
            #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
            #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
            #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

This is not a leak in our code, but appears to be a race between one
thread calling exit() while another one is in LSan's stack setup code.
You can reproduce it easily by running t0003 or t5309 with --stress
(these trigger it because of the threading in git-grep and index-pack
respectively).

This may be a bug in LSan, but regardless of whether it is eventually
fixed, it is useful to work around it so that we stop seeing these false
positives.

We can recognize it by the mention of the sanitizer functions in the
DEDUP_TOKEN line. With this patch, the scripts mentioned above should
run with --stress indefinitely.

Signed-off-by: Jeff King <peff@peff.net>
---
One small downside here is that this just suppresses the "were there any
leaks" check. If there's a real leak _and_ the race was triggered, then
you'd see the racy false positive in the output. I don't think that's a
big deal, since both the race and real leaks should be rare-ish, and
you'd have to encounter both in the same run of a given test script.

 t/test-lib.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Patrick Steinhardt Jan. 3, 2025, 12:05 p.m. UTC | #1
On Wed, Jan 01, 2025 at 03:21:24PM -0500, Jeff King wrote:
> One small downside here is that this just suppresses the "were there any
> leaks" check. If there's a real leak _and_ the race was triggered, then
> you'd see the racy false positive in the output. I don't think that's a
> big deal, since both the race and real leaks should be rare-ish, and
> you'd have to encounter both in the same run of a given test script.

Yeah, I think that's fine. You'd have to both be unlucky and trigger the
leak _and_ you'd have to be debugging a testcase that hits the race in
the first place. Which together feels unlikely enough to really matter.

>  t/test-lib.sh | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index c9487d0805..d1f62adbf8 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1177,7 +1177,8 @@ check_test_results_san_file_empty_ () {
>  	! find "$TEST_RESULTS_SAN_DIR" \
>  		-type f \
>  		-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> -	xargs grep -q ^DEDUP_TOKEN
> +	xargs grep ^DEDUP_TOKEN |
> +	grep -qv sanitizer::GetThreadStackTopAndBottom
>  }

It would be nice to provide some more context here in the form of a
comment so that one doesn't have to blame the commit.

Patrick
Jeff King Jan. 3, 2025, 8:26 p.m. UTC | #2
On Fri, Jan 03, 2025 at 01:05:48PM +0100, Patrick Steinhardt wrote:

> > diff --git a/t/test-lib.sh b/t/test-lib.sh
> > index c9487d0805..d1f62adbf8 100644
> > --- a/t/test-lib.sh
> > +++ b/t/test-lib.sh
> > @@ -1177,7 +1177,8 @@ check_test_results_san_file_empty_ () {
> >  	! find "$TEST_RESULTS_SAN_DIR" \
> >  		-type f \
> >  		-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> > -	xargs grep -q ^DEDUP_TOKEN
> > +	xargs grep ^DEDUP_TOKEN |
> > +	grep -qv sanitizer::GetThreadStackTopAndBottom
> >  }
> 
> It would be nice to provide some more context here in the form of a
> comment so that one doesn't have to blame the commit.

We can add that on top, but I'm not sure what it should say. Do you want
something along the lines of "add false positives to ignore here..." or
are an explanation of why we are ignoring this particular false
positive?

-Peff
Patrick Steinhardt Jan. 6, 2025, 7:56 a.m. UTC | #3
On Fri, Jan 03, 2025 at 03:26:45PM -0500, Jeff King wrote:
> On Fri, Jan 03, 2025 at 01:05:48PM +0100, Patrick Steinhardt wrote:
> 
> > > diff --git a/t/test-lib.sh b/t/test-lib.sh
> > > index c9487d0805..d1f62adbf8 100644
> > > --- a/t/test-lib.sh
> > > +++ b/t/test-lib.sh
> > > @@ -1177,7 +1177,8 @@ check_test_results_san_file_empty_ () {
> > >  	! find "$TEST_RESULTS_SAN_DIR" \
> > >  		-type f \
> > >  		-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> > > -	xargs grep -q ^DEDUP_TOKEN
> > > +	xargs grep ^DEDUP_TOKEN |
> > > +	grep -qv sanitizer::GetThreadStackTopAndBottom
> > >  }
> > 
> > It would be nice to provide some more context here in the form of a
> > comment so that one doesn't have to blame the commit.
> 
> We can add that on top, but I'm not sure what it should say. Do you want
> something along the lines of "add false positives to ignore here..." or
> are an explanation of why we are ignoring this particular false
> positive?

The latter, ideally also with a reference to the upstream issue you have
created. That makes it way easier to discover why this line exists and
may prompt people to remove the line eventually if they discover that
the issue has been fixed for a while.

It doesn't have to be a full paragraph, but a sentence or to go a long
way sometimes. For more context people can still blame the commit
message.

Patrick
diff mbox series

Patch

diff --git a/t/test-lib.sh b/t/test-lib.sh
index c9487d0805..d1f62adbf8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1177,7 +1177,8 @@  check_test_results_san_file_empty_ () {
 	! find "$TEST_RESULTS_SAN_DIR" \
 		-type f \
 		-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
-	xargs grep -q ^DEDUP_TOKEN
+	xargs grep ^DEDUP_TOKEN |
+	grep -qv sanitizer::GetThreadStackTopAndBottom
 }
 
 check_test_results_san_file_ () {