diff mbox series

[4/7] builtin/gc: stop processing log file on signal

Message ID 812c61c9b66d7608e41c6c1d00a6e22e995cef06.1723533091.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series builtin/maintenance: fix auto-detach with non-standard tasks | expand

Commit Message

Patrick Steinhardt Aug. 13, 2024, 7:17 a.m. UTC
When detaching, git-gc(1) will redirect its stderr to a "gc.log" log
file, which is then used to surface errors of a backgrounded process to
the user. To ensure that the file is properly managed on abnormal exit
paths, we install both signal and exit handlers that try to either
commit the underlying lock file or roll it back in case there wasn't any
error.

This logic is severly broken when handling signals though, as we end up
calling all kinds of functions that are not signal safe. This includes
malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more
functions. The consequence can be anything, from deadlocks to crashes.
Unfortunately, we cannot really do much about this without a larger
refactoring.

The least-worst thing we can do is to not set up the signal handler in
the first place. This will still cause us to remove the lockfile, as the
underlying tempfile subsystem already knows to unlink locks when
receiving a signal. But it may cause us to remove the lock even in the
case where it would have contained actual errors, which is a change in
behaviour.

The consequence is that "gc.log" will not be committed, and thus
subsequent calls to `git gc --auto` won't bail out because of this.
Arguably though, it is better to retry garbage collection rather than
having the process run into a potentially-corrupted state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/gc.c | 8 --------
 1 file changed, 8 deletions(-)

Comments

James Liu Aug. 15, 2024, 6:01 a.m. UTC | #1
On Tue Aug 13, 2024 at 5:17 PM AEST, Patrick Steinhardt wrote:
> The consequence is that "gc.log" will not be committed, and thus
> subsequent calls to `git gc --auto` won't bail out because of this.
> Arguably though, it is better to retry garbage collection rather than
> having the process run into a potentially-corrupted state.

Ahh I see, because `report_last_gc_error()` won't have anything to read.
I agree it's an appropriate tradeoff given that garbage collection is
not on the critical path, and it's not likely that GC will be
interrupted on every attempt.

> @@ -807,7 +800,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  					  git_path("gc.log"),
>  					  LOCK_DIE_ON_ERROR);
>  		dup2(get_lock_file_fd(&log_lock), 2);
> -		sigchain_push_common(process_log_file_on_signal);
>  		atexit(process_log_file_at_exit);
>  	}
>
diff mbox series

Patch

diff --git a/builtin/gc.c b/builtin/gc.c
index a93cfa147e..f815557081 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -109,13 +109,6 @@  static void process_log_file_at_exit(void)
 	process_log_file();
 }
 
-static void process_log_file_on_signal(int signo)
-{
-	process_log_file();
-	sigchain_pop(signo);
-	raise(signo);
-}
-
 static int gc_config_is_timestamp_never(const char *var)
 {
 	const char *value;
@@ -807,7 +800,6 @@  int cmd_gc(int argc, const char **argv, const char *prefix)
 					  git_path("gc.log"),
 					  LOCK_DIE_ON_ERROR);
 		dup2(get_lock_file_fd(&log_lock), 2);
-		sigchain_push_common(process_log_file_on_signal);
 		atexit(process_log_file_at_exit);
 	}