Message ID | 588de2e4f16ab090ff477088084e0b81d9615ec5.1704909216.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1` |
Taylor Blau <me@ttaylorr.com> writes:

> But that requires us to tweak production code (albeit at a negligible
> cost) in order to appease LSan in this narrow circumstance. Another
> approach is to simply run these expected-to-fail `index-pack`
> invocations with `--threads=1` so that we bypass the above issue
> entirely.

But of course, multi-threaded operation that production folks use
will not be tested at all with the alternative.

> The downside of that approach is that the test doesn't match our
> production code as well as it did before (where we might have run those
> same `index-pack` invocations with >1 thread, depending on how many CPUs
> the testing machine has). The risk there is that we might miss a
> regression that would leave us in an inconsistent state. But that feels
> rather unlikely in practice, and there are many other tests related to
> `index-pack` in the suite.

As long as "make sure we spawn all of them atomically" has negligible
downside, I'd rather take that approach. Buying predictability with
minimum cost is quite attractive.

Thanks.
On Wed, Jan 10, 2024 at 02:18:52PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > But that requires us to tweak production code (albeit at a negligible
> > cost) in order to appease LSan in this narrow circumstance. Another
> > approach is to simply run these expected-to-fail `index-pack`
> > invocations with `--threads=1` so that we bypass the above issue
> > entirely.
>
> But of course, multi-threaded operation that production folks use
> will not be tested at all with the alternative.

Just the ones that we expect to fail *and* are in test scripts which are
marked as leak-free.

> > The downside of that approach is that the test doesn't match our
> > production code as well as it did before (where we might have run those
> > same `index-pack` invocations with >1 thread, depending on how many CPUs
> > the testing machine has). The risk there is that we might miss a
> > regression that would leave us in an inconsistent state. But that feels
> > rather unlikely in practice, and there are many other tests related to
> > `index-pack` in the suite.
>
> As long as "make sure we spawn all of them atomically" has negligible
> downside, I'd rather take that approach. Buying predictability with
> minimum cost is quite attractive.

Like I said earlier, I have no strong preference between either
approach. If you want to pick up Peff's patch instead of these five,
that is fine with me :-).

Thanks,
Taylor
diff --git a/t/t5309-pack-delta-cycles.sh b/t/t5309-pack-delta-cycles.sh
index 4e910c5b9d..4100595c89 100755
--- a/t/t5309-pack-delta-cycles.sh
+++ b/t/t5309-pack-delta-cycles.sh
@@ -44,7 +44,7 @@ test_expect_success 'index-pack detects missing base objects' '
 		pack_obj $A $B
 	} >missing.pack &&
 	pack_trailer missing.pack &&
-	test_must_fail git index-pack --fix-thin --stdin <missing.pack
+	test_must_fail git index-pack --threads=1 --fix-thin --stdin <missing.pack
 '
 
 test_expect_success 'index-pack detects REF_DELTA cycles' '
@@ -55,13 +55,13 @@ test_expect_success 'index-pack detects REF_DELTA cycles' '
 		pack_obj $B $A
 	} >cycle.pack &&
 	pack_trailer cycle.pack &&
-	test_must_fail git index-pack --fix-thin --stdin <cycle.pack
+	test_must_fail git index-pack --threads=1 --fix-thin --stdin <cycle.pack
 '
 
 test_expect_success 'failover to an object in another pack' '
 	clear_packs &&
 	git index-pack --stdin <ab.pack &&
-	test_must_fail git index-pack --stdin --fix-thin <cycle.pack
+	test_must_fail git index-pack --threads=1 --stdin --fix-thin <cycle.pack
 '
 
 test_expect_success 'failover to a duplicate object in the same pack' '
@@ -73,7 +73,7 @@ test_expect_success 'failover to a duplicate object in the same pack' '
 		pack_obj $A
 	} >recoverable.pack &&
 	pack_trailer recoverable.pack &&
-	test_must_fail git index-pack --fix-thin --stdin <recoverable.pack
+	test_must_fail git index-pack --threads=1 --fix-thin --stdin <recoverable.pack
 '
 
 test_done
The t5309 script triggers a racy false positive with SANITIZE=leak on a
multi-core system. Running with "--stress --run=6" usually fails within
10 seconds or so for me, complaining with something like:

    + git index-pack --fix-thin --stdin
    fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?)

    =================================================================
    ==3904583==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 32 byte(s) in 1 object(s) allocated from:
        #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
        #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
        #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
        #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
        #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
        #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
        #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
        #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

    SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).
    Aborted

What happens is this:

  0. We construct a bogus pack with a duplicate object in it and trigger
     index-pack.

  1. We spawn a bunch of worker threads to resolve deltas (on my system
     it is 16 threads).

  2. One of the threads sees the duplicate object and bails by calling
     exit(), taking down all of the threads. This is expected and is the
     point of the test.

  3. At the time exit() is called, we may still be spawning threads from
     the main process via pthread_create(). LSan hooks thread creation
     to update its book-keeping; it has to know where each thread's
     stack is (so it can find entry points for reachable memory). So it
     calls pthread_getattr_np() to get information about the new thread.
     That may allocate memory that must be freed with a matching call to
     pthread_attr_destroy(). Probably LSan does that immediately, but
     if you're unlucky enough, the exit() will happen while it's between
     those two calls, and the allocated pthread_attr_t appears as a
     leak.

This isn't a real leak. It's not even in our code, but rather in the
LSan instrumentation code. So we could just ignore it. But the false
positive can cause people to waste time tracking it down.

It's possibly something that LSan could protect against (e.g., cover the
getattr/destroy pair with a mutex, and then in the final post-exit()
check for leaks try to take the same mutex). But I don't know enough
about LSan to say if that's a reasonable approach or not (or if my
analysis is even completely correct).

One approach to papering over this issue (short of LSan fixing it
upstream) is to make the creation of work threads "atomic", i.e. by
spawning all of them before letting any of them start to work. This
shouldn't make any practical difference for non-LSan runs. The thread
spawning is quick, and could happen before any worker thread gets
scheduled anyway.

But that requires us to tweak production code (albeit at a negligible
cost) in order to appease LSan in this narrow circumstance. Another
approach is to simply run these expected-to-fail `index-pack`
invocations with `--threads=1` so that we bypass the above issue
entirely.

The downside of that approach is that the test doesn't match our
production code as well as it did before (where we might have run those
same `index-pack` invocations with >1 thread, depending on how many CPUs
the testing machine has). The risk there is that we might miss a
regression that would leave us in an inconsistent state. But that feels
rather unlikely in practice, and there are many other tests related to
`index-pack` in the suite.

Original-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5309-pack-delta-cycles.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
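
For readers unfamiliar with the "atomic" thread creation alternative that the commit message mentions, here is a rough C sketch of the general idea: the spawning thread holds a mutex for the whole pthread_create() loop, and each worker takes that mutex before doing anything else, so no worker can reach a code path that might call exit() while threads are still being created. This is only an illustration under stated assumptions, not the code from Peff's patch or from index-pack itself; the names start_gate, worker, spawn_all_then_run and NR_THREADS are hypothetical.

```c
#include <pthread.h>

#define NR_THREADS 4	/* hypothetical worker count for this sketch */

/* Hypothetical gate: held by the spawner until every thread exists. */
static pthread_mutex_t start_gate = PTHREAD_MUTEX_INITIALIZER;
static int nr_spawned;			/* updated only while start_gate is held */
static int seen_at_start[NR_THREADS];	/* what each worker saw on wake-up */

static void *worker(void *arg)
{
	int id = *(int *)arg;

	/* Block here until the spawner has finished creating all threads. */
	pthread_mutex_lock(&start_gate);
	seen_at_start[id] = nr_spawned;
	pthread_mutex_unlock(&start_gate);

	/* Real work (which might bail out via exit()) would start here. */
	return NULL;
}

/* Returns 1 if every worker started only after all threads existed. */
int spawn_all_then_run(void)
{
	pthread_t threads[NR_THREADS];
	int ids[NR_THREADS];
	int i, created = 0, ok = 1;

	pthread_mutex_lock(&start_gate);
	nr_spawned = 0;
	for (i = 0; i < NR_THREADS; i++) {
		ids[i] = i;
		if (pthread_create(&threads[i], NULL, worker, &ids[i]))
			break;	/* creation failed; stop spawning */
		created++;
		nr_spawned++;
	}
	/* Only now may any worker proceed past its lock call. */
	pthread_mutex_unlock(&start_gate);

	for (i = 0; i < created; i++)
		pthread_join(threads[i], NULL);
	if (created != NR_THREADS)
		return -1;

	for (i = 0; i < NR_THREADS; i++)
		if (seen_at_start[i] != NR_THREADS)
			ok = 0;
	return ok;
}
```

The trade-off discussed in the thread is visible here: a gate like this touches production code paths, whereas `--threads=1` only changes the test invocations.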