Message ID | 20181010192732.13918-1-avarab@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | gc: introduce an --auto-exit-code option for undoing 3029970275 | expand |
On Wed, Oct 10, 2018 at 07:27:32PM +0000, Ævar Arnfjörð Bjarmason wrote: > Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode' > configuration option to optionally bring back the 'git gc --auto' exit > code behavior as it existed between 2.6.3..2.19.0 (inclusive). > > This was changed in 3029970275 ("gc: do not return error for prior > errors in daemonized mode", 2018-07-16). The motivation for that patch > was to appease 3rd party tools whose treatment of the 'git gc --auto' > exit code is different from that of git core where it has always been > ignored. OK. I wouldn't want to use this myself, but I think you've made clear why you find it useful. So I don't mind making it an optional behavior (and it probably beats you trying to poke at the logfile yourself). I'm not sure if the config is going to actually help that much, though. The callers within Git will generally ignore the exit code anyway. So for those cases, setting it will at best do nothing, and at worst it may confuse the few stragglers (e.g., the git-svn one under recent discussion). Callers who _are_ prepared to act on the exit code probably ought to just use --auto-exit-code in their invocation. That said, I'm not entirely opposed to the matching config. There's enough history here that somebody might want a sledgehammer setting to go back to the old behavior. -Peff
Hi, Ævar Arnfjörð Bjarmason wrote: > Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode' > configuration option to optionally bring back the 'git gc --auto' exit > code behavior as it existed between 2.6.3..2.19.0 (inclusive). Hm. Can you tell me more about the use case where this would be helpful to you? That would help us come up with a better name for it. Thanks, Jonathan
On Wed, Oct 10 2018, Jeff King wrote: > On Wed, Oct 10, 2018 at 07:27:32PM +0000, Ævar Arnfjörð Bjarmason wrote: > >> Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode' >> configuration option to optionally bring back the 'git gc --auto' exit >> code behavior as it existed between 2.6.3..2.19.0 (inclusive). >> >> This was changed in 3029970275 ("gc: do not return error for prior >> errors in daemonized mode", 2018-07-16). The motivation for that patch >> was to appease 3rd party tools whose treatment of the 'git gc --auto' >> exit code is different from that of git core where it has always been >> ignored. > > OK. I wouldn't want to use this myself, but I think you've made clear > why you find it useful. So I don't mind making it an optional behavior > (and it probably beats you trying to poke at the logfile yourself). [...] > I'm not sure if the config is going to actually help that much, though. > The callers within Git will generally ignore the exit code anyway. So > for those cases, setting it will at best do nothing, and at worst it may > confuse the few stragglers (e.g., the git-svn one under recent > discussion). Yeah git internals don't care, but we've never advertised the combination of --auto and gc.autoDetach=true as being something internal-only, so e.g. I wrote stuff expecting errors, and one might run "git gc --auto" in a repo whose .git/objects state is uncertain to see if it needed repack (and have a shell integration that reports failures...). > Callers who _are_ prepared to act on the exit code probably ought to > just use --auto-exit-code in their invocation. > > That said, I'm not entirely opposed to the matching config. There's > enough history here that somebody might want a sledgehammer setting to > go back to the old behavior. If it's not a config option then as git is upgraded I'll need to change my across-server invocation to be some variant of checking git version, then etiher using the --auto-exit-code option or not (which'll error on older gits). Easier to be able to just drop in a config setting before the upgrade.
On Wed, Oct 10 2018, Jonathan Nieder wrote: > Hi, > > Ævar Arnfjörð Bjarmason wrote: > >> Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode' >> configuration option to optionally bring back the 'git gc --auto' exit >> code behavior as it existed between 2.6.3..2.19.0 (inclusive). > > Hm. Can you tell me more about the use case where this would be > helpful to you? That would help us come up with a better name for it. From the E-Mail linked from the commit message[1] (I opted not to put this in, because it was getting a bit long: Right. I know. What I mean is now I can (and do) use it to run 'git gc --auto' across our server fleet and see whether I have any of #3, or whether it's all #1 or #2. If there's nothing to do in #1 that's fine, and it just so happens that I'll run gc due to #2 that's also fine, but I'd like to see if gc really is stuck. This of course relies on them having other users / scripts doing normal git commands which would trigger previous 'git gc --auto' runs. I.e. with your change that command: git gc --auto Would change to something like: git gc --auto && ! test -e .git/gc.log Which, as noted is a bit of a nasty breaker of the encapsulation, so now: git gc --auto --auto-exit-code Or just a variant of that which will have dropped the config in-place in /etc/gitconfig, and then as before: git gc --auto 1. https://public-inbox.org/git/878t69dgvx.fsf@evledraar.gmail.com/
Ævar Arnfjörð Bjarmason wrote: > Right. I know. What I mean is now I can (and do) use it to run 'git gc > --auto' across our server fleet and see whether I have any of #3, or > whether it's all #1 or #2. If there's nothing to do in #1 that's fine, > and it just so happens that I'll run gc due to #2 that's also fine, but > I'd like to see if gc really is stuck. > > This of course relies on them having other users / scripts doing normal > git commands which would trigger previous 'git gc --auto' runs. > > I.e. with your change that command: > > git gc --auto > > Would change to something like: > > git gc --auto && ! test -e .git/gc.log > > Which, as noted is a bit of a nasty breaker of the encapsulation That helps. What if we package up the "test -e .git/gc.log" bit *without* having the side effect of running git gc --auto, so that you can run if ! git gc --detached-exit-code then ... handle the error ... fi git gc --auto; # perhaps also with --detach ? I'm not great at naming options, so the --detached-exit-code name is bikesheddable. What I really mean to ask about is: what if the status reporting goes in a separate command from running gc --auto? Perhaps this reporting could also print the message from a previous run, so you could write: git gc --detached-status || exit git gc --auto; # perhaps also passing --detach (Names still open for bikeshedding.) Thanks and hope that helps, Jonathan
Jonathan Nieder <jrnieder@gmail.com> writes: > Perhaps this reporting could also print the message from a previous > run, so you could write: > > git gc --detached-status || exit > git gc --auto; # perhaps also passing --detach > > (Names still open for bikeshedding.) When the command is given --detached-exit-code/status option, what does it do? Does it perform the "did an earlier run left gc.log?" and report the result and nothing else? In other words, is it a pure replacement for "test -e .git/gc.log"? Or does it do some of the "auto-gc" prep logic like guestimating loose object count and have that also in its exit status (e.g. "from the gc.log left behind, we know that we failed to reduce loose object count down sufficiently after finding there are more than 6700 earlier, but now we do not have that many loose object, so there is nothing to complain about the presence of gc.log")? I am bad at naming myself, but worse at guessing what others meant with a new thing that was given a new name whose name is fuzzy, so... ;-)
Junio C Hamano wrote: > Jonathan Nieder <jrnieder@gmail.com> writes: >> Perhaps this reporting could also print the message from a previous >> run, so you could write: >> >> git gc --detached-status || exit >> git gc --auto; # perhaps also passing --detach >> >> (Names still open for bikeshedding.) > > When the command is given --detached-exit-code/status option, what > does it do? Does it perform the "did an earlier run left gc.log?" > and report the result and nothing else? In other words, is it a > pure replacement for "test -e .git/gc.log"? My intent was the latter. In other words, in the idiom do_something_async & ... a lot of time passes ... wait it is something like the replacement for "wait". More precisely, git gc --detached-status || exit would mean something like if test -e .git/gc.log # Error from previous gc --detach? then cat >&2 .git/gc.log # Report the error. exit 1 fi > Or does it do some of > the "auto-gc" prep logic like guestimating loose object count and > have that also in its exit status (e.g. "from the gc.log left > behind, we know that we failed to reduce loose object count down > sufficiently after finding there are more than 6700 earlier, but now > we do not have that many loose object, so there is nothing to > complain about the presence of gc.log")? Depending on the use case, a user might want to avoid losing information about the results of a previous "git gc --detach" run, even if they no longer apply. For example, a user might want to collect the error message for monitoring or later log analysis, to track down intermittent gc errors that go away on their own. A separate possible use case might be a git gc --needs-auto-gc command that detects whether an auto gc is needed. With that, a caller that only wants to learn about errors if auto gc is needed could run if git gc --needs-auto-gc then git gc --detached-status || exit fi > I am bad at naming myself, but worse at guessing what others meant > with a new thing that was given a new name whose name is fuzzy, > so... ;-) No problem. I'm mostly trying to tease out more details about the use case. Thanks, Jonathan
On Wed, Oct 10 2018, Jonathan Nieder wrote: > Junio C Hamano wrote: >> Jonathan Nieder <jrnieder@gmail.com> writes: > >>> Perhaps this reporting could also print the message from a previous >>> run, so you could write: >>> >>> git gc --detached-status || exit >>> git gc --auto; # perhaps also passing --detach >>> >>> (Names still open for bikeshedding.) >> >> When the command is given --detached-exit-code/status option, what >> does it do? Does it perform the "did an earlier run left gc.log?" >> and report the result and nothing else? In other words, is it a >> pure replacement for "test -e .git/gc.log"? > > My intent was the latter. In other words, in the idiom > > do_something_async & > ... a lot of time passes ... > wait > > it is something like the replacement for "wait". > > More precisely, > > git gc --detached-status || exit > > would mean something like > > if test -e .git/gc.log # Error from previous gc --detach? > then > cat >&2 .git/gc.log # Report the error. > exit 1 > fi > >> Or does it do some of >> the "auto-gc" prep logic like guestimating loose object count and >> have that also in its exit status (e.g. "from the gc.log left >> behind, we know that we failed to reduce loose object count down >> sufficiently after finding there are more than 6700 earlier, but now >> we do not have that many loose object, so there is nothing to >> complain about the presence of gc.log")? > > Depending on the use case, a user might want to avoid losing > information about the results of a previous "git gc --detach" run, > even if they no longer apply. For example, a user might want to > collect the error message for monitoring or later log analysis, to > track down intermittent gc errors that go away on their own. > > A separate possible use case might be a > > git gc --needs-auto-gc > > command that detects whether an auto gc is needed. With that, a > caller that only wants to learn about errors if auto gc is needed > could run > > if git gc --needs-auto-gc > then > git gc --detached-status || exit > fi > >> I am bad at naming myself, but worse at guessing what others meant >> with a new thing that was given a new name whose name is fuzzy, >> so... ;-) > > No problem. I'm mostly trying to tease out more details about the use > case. Likewise, so don't take the following as an assertion of fact, but more of a fact-finding mission: We could add something like this --detached-status / --needs-auto-gc, but I don't need it, and frankly I can't think of a reason for why anyone would want to use these. The entire point of having gc --auto in the first place is that you don't care when exactly GC happens, you're happy with whenever git decides it's needed. So why would anyone need a --needs-auto-gc? If your criteria for doing GC exactly matches that of gc --auto then ... you just run gc --auto, if it isn't (e.g. if you're using Microsoft's Windows repo) you're not using gc --auto in the first place, and neither --needs-auto-gc nor --auto is useful to you. So maybe I'm missing something here, but a --needs-auto-gc just seems like a gratuitous exposure of an internal implementation detail whose only actionable result is doing what we're doing with "gc --auto" now, i.e. just run gc. Which is what I'm doing by running "gc --auto" across a set of servers and looking at the exit code. If it's been failing I get an error, if there's no need to gc nothing happens, and if it hasn't been failing and it just so happens that it's time to GC then fine, now was as good a time as any. So if we assume that for the sake of argument there's no point in a --detached-status either. My only reason for ever caring about that status is when I run "gc --auto" and it says it can't fork() itself so it fails. Since I'm using "gc --auto" I have zero reason to even ask that question unless I'm OK with kicking off a gc run as a side-effect, so why split up the two? It just introduces a race condition for no benefit.
Ævar Arnfjörð Bjarmason wrote: > Which is what I'm doing by running "gc --auto" across a set of servers > and looking at the exit code. If it's been failing I get an error, if > there's no need to gc nothing happens, and if it hasn't been failing and > it just so happens that it's time to GC then fine, now was as good a > time as any. For this, a simple "git gc --detached-status" would work. It sounds like the bonus gc --auto run was a side effect instead of being a requirement. > So if we assume that for the sake of argument there's no point in a > --detached-status either. My only reason for ever caring about that > status is when I run "gc --auto" and it says it can't fork() itself so > it fails. Since I'm using "gc --auto" I have zero reason to even ask > that question unless I'm OK with kicking off a gc run as a side-effect, > so why split up the two? It just introduces a race condition for no > benefit. What I am trying to do is design an interface which is simple to explain in the manual. The existing "git gc --auto" interface since detaching was introduced is super confusing to me, so I'm going through the thought exercise of "If we were starting over, what would we build instead?" Part of the answer to that question might include a --report-from-the-last-time-you-detached option. I'm still failing to come up of a case where the answer to that question would include a --report-from-the-last-time-you-detached-and-if-it-went-okay\ -then-run-another-detached-gc option. In other words, I think our disconnect is that you are describing things in terms of "I have been happy with the existing git gc --auto detaching behavior, so how can we maintain something as close as possible to that"? And I am trying to describe things in terms of "What is the simplest, most maintainable, and easiest to explain way to keep Ævar's servers working well"? Jonathan
On Wed, Oct 10, 2018 at 10:59:45PM +0200, Ævar Arnfjörð Bjarmason wrote: > > Callers who _are_ prepared to act on the exit code probably ought to > > just use --auto-exit-code in their invocation. > > > > That said, I'm not entirely opposed to the matching config. There's > > enough history here that somebody might want a sledgehammer setting to > > go back to the old behavior. > > If it's not a config option then as git is upgraded I'll need to change > my across-server invocation to be some variant of checking git version, > then etiher using the --auto-exit-code option or not (which'll error on > older gits). Easier to be able to just drop in a config setting before > the upgrade. Yeah, that's the "there's enough history here" that I was referring to, but I hadn't quite thought through a concrete example. That makes sense. (Though I also think the other part of the thread is reasonable, too, where we'd just have a command to abstract away "cat .git/gc.log" into "git gc --show-detached-log" or something). -Peff
diff --git a/Documentation/config.txt b/Documentation/config.txt index 5b72684999..e37a463bf8 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1635,6 +1635,34 @@ gc.autoDetach:: Make `git gc --auto` return immediately and run in background if the system supports it. Default is true. +gc.autoExitCode:: + Make `git gc --auto` return non-zero if it would have + demonized itself (see `gc.autoDetach`) due to a needed gc, but + a 'gc.log' is found within the `gc.logExpiry` with an error + from a previous run. ++ +When 'git gc' is run with the default of `gc.autoDetach=true` a +failure might have been noted in the 'gc.log' from a previously +detached `--auto` run. If the failure is within the time configured in +`gc.logExpiry` the `--auto` run will abort early and report the error +in the 'gc.log'. ++ +From version 2.6.3 to version 2.19 of Git encountering this error +would cause 'git gc' to exit with non-zero, but this was deemed to be +a hassle for third-party tools to handle since it rarely happens, and +they usually don't assume that 'git gc --auto' can fail. Therefore +since version 2.20 of Git 'git gc --auto' will always exit with zero +if it would have demonized itself, even when encountering such an +error. ++ +Supplying this option will make 'git gc' exit with non-zero in that +case, which allows for detecting cases where a repository is in a +state where git 'gc --auto' is refusing to demonize due to previously +encountered errors. ++ +This option can also be turned on as a one-off with the +`--auto-exit-code` option, see linkgit:git-gc[1]. + gc.bigPackThreshold:: If non-zero, all packs larger than this limit are kept when `git gc` is run. This is very similar to `--keep-base-pack` diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt index 24b2dd44fe..adf53fc959 100644 --- a/Documentation/git-gc.txt +++ b/Documentation/git-gc.txt @@ -71,6 +71,13 @@ If houskeeping is required due to many loose objects or packs, all other housekeeping tasks (e.g. rerere, working trees, reflog...) will be performed as well. +--auto-exit-code:: + Make `git gc --auto` return non-zero if it would have + demonized itself (see `gc.autoDetach`) due to a needed gc, but + a 'gc.log' is found within the time period of `gc.logExpiry` + with an error from a previous run. See linkgit:git-config[1] + for more details about this option which can also be + configured with the `gc.autoExitCode` boolean variable. --prune=<date>:: Prune loose objects older than date (default is 2 weeks ago, diff --git a/builtin/gc.c b/builtin/gc.c index ce8a663a01..69c0dedd8c 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -41,6 +41,7 @@ static int aggressive_window = 250; static int gc_auto_threshold = 6700; static int gc_auto_pack_limit = 50; static int detach_auto = 1; +static int gc_auto_exit_code = 0; static timestamp_t gc_log_expire_time; static const char *gc_log_expire = "1.day.ago"; static const char *prune_expire = "2.weeks.ago"; @@ -130,6 +131,7 @@ static void gc_config(void) git_config_get_int("gc.auto", &gc_auto_threshold); git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit); git_config_get_bool("gc.autodetach", &detach_auto); + git_config_get_bool("gc.autoexitcode", &gc_auto_exit_code); git_config_get_expiry("gc.pruneexpire", &prune_expire); git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire); git_config_get_expiry("gc.logexpiry", &gc_log_expire); @@ -518,6 +520,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "aggressive", &aggressive, N_("be more thorough (increased runtime)")), OPT_BOOL_F(0, "auto", &auto_gc, N_("enable auto-gc mode"), PARSE_OPT_NOCOMPLETE), + OPT_BOOL_F(0, "auto-exit-code", &gc_auto_exit_code, + N_("exit with non-zero if an error is encountered before gc is daemonized"), + PARSE_OPT_NOCOMPLETE), OPT_BOOL_F(0, "force", &force, N_("force running gc even if there may be another gc running"), PARSE_OPT_NOCOMPLETE), @@ -582,7 +587,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix) exit(128); if (ret == 1) /* Last gc --auto failed. Skip this one. */ - return 0; + return gc_auto_exit_code; if (lock_repo_for_gc(force, &pid)) return 0; diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index a222efdbe1..25bce70316 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -121,6 +121,8 @@ test_expect_success 'background auto gc does not run if gc.log is present and re test_config gc.logexpiry 5.days && test-tool chmtime =-345600 .git/gc.log && git gc --auto && + test_must_fail git gc --auto --auto-exit-code && + test_must_fail git -c gc.autoExitCode=true gc --auto && test_config gc.logexpiry 2.days && run_and_wait_for_auto_gc && ls .git/objects/pack/pack-*.pack >packs &&