diff mbox series

reflog expire --stale-fix: be generous about missing objects

Message ID pull.873.git.1612973499110.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 899034efca9cd17115a648045385846231e6c0dc
Headers show
Series reflog expire --stale-fix: be generous about missing objects | expand

Commit Message

Johannes Schindelin Feb. 10, 2021, 4:11 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

Whenever a user runs `git reflog expire --stale-fix`, the most likely
reason is that their repository is at least _somewhat_ corrupt. Which
means that it is more than just possible that some objects are missing.

If that is the case, that can currently let the command abort through
the phase where it tries to mark all reachable objects.

Instead of adding insult to injury, let's be gentle and continue as best
as we can in such a scenario, simply by ignoring the missing objects and
moving on.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    Make git reflog expire --stale-fix a lot more useful
    
    Yesterday, I tried to run a quick test to find out whether master's
    version of git repack prevents .bitmap files from being deleted by still
    having them mmap()ed. Since I do not have a build of master lying around
    just like that, I checked it out, built the thing, and then ran
    
    ./bin-wrappers/git -c alias.c='!(cd /path/to/directory && ./test-whether-bitmaps-are-mmaped-during-repack.sh) c
    
    
    Do NOT try this at home! The problem with this invocation is that the
    alias will still have GIT_DIR set, therefore the git init in that script
    will not create a new Git directory, and the git repack -ad in that
    script will remove all kinds of precious objects from the Git checkout.
    Even though I interrupted the run as quickly as I realized that things
    were going wrong, my repository was corrupted in a major way, and it
    took me many hours to get back to a healthy state.
    
    It made matters worse that git reflog expire --stale-fix was less
    helpful than it could have been, and this patch is the result of my
    directed emotional energy.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-873%2Fdscho%2Freflog-expire-with-missing-objects-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-873/dscho/reflog-expire-with-missing-objects-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/873

 builtin/reflog.c  |  3 +++
 t/t1410-reflog.sh | 26 ++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)


base-commit: 1d4f2316c5b767ccbf20cc3d55c98d1f92e6e1ce

Comments

Junio C Hamano Feb. 10, 2021, 8:30 p.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Whenever a user runs `git reflog expire --stale-fix`, the most likely
> reason is that their repository is at least _somewhat_ corrupt. Which
> means that it is more than just possible that some objects are missing.
>
> If that is the case, that can currently let the command abort through
> the phase where it tries to mark all reachable objects.
>
> Instead of adding insult to injury, let's be gentle and continue as best
> as we can in such a scenario, simply by ignoring the missing objects and
> moving on.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---

I am of two minds.

I appreciate an effort of making it looser and less likely to get in
trouble when running in a corrupt repository, but --stale-fix is a
bit special and should probably be removed.

The only reason the option was added was because we needed to have
an easy way to recover from a specific kind of reflog corruption
that would have resulted by a (then known) bug in "git reflog" in
the released version of Git that came immediately before the version
of Git that added the "fix" option, while the root cause of the
corruption got fixed.  

Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was
written, it was very useful to have a way to recover from the
corruption likely to have happened with the version of Git that came
before it.  But it no longer is relevant after this many years.
There may be other ways to break the reflog entries, but --stale-fix
was never designed to deal with anything but a specific way the
reflog gets corrupted (namely, by the old version of Git that
corrupted reflog in a specific way), so keeping it would not be very
useful.
Jeff King Feb. 11, 2021, 11:10 a.m. UTC | #2
On Wed, Feb 10, 2021 at 12:30:27PM -0800, Junio C Hamano wrote:

> I appreciate an effort of making it looser and less likely to get in
> trouble when running in a corrupt repository, but --stale-fix is a
> bit special and should probably be removed.
> 
> The only reason the option was added was because we needed to have
> an easy way to recover from a specific kind of reflog corruption
> that would have resulted by a (then known) bug in "git reflog" in
> the released version of Git that came immediately before the version
> of Git that added the "fix" option, while the root cause of the
> corruption got fixed.
> 
> Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was
> written, it was very useful to have a way to recover from the
> corruption likely to have happened with the version of Git that came
> before it.  But it no longer is relevant after this many years.
> There may be other ways to break the reflog entries, but --stale-fix
> was never designed to deal with anything but a specific way the
> reflog gets corrupted (namely, by the old version of Git that
> corrupted reflog in a specific way), so keeping it would not be very
> useful.

FWIW, I have used --stale-fix for cases where a repo's reflog was
"corrupted" by its alternate pruning objects.

E.g., if you do something like this:

  git clone -s orig.git new.git

you're now at risk of "git gc" in orig.git corrupting new.git, because
its reachability check doesn't know anything about those refs. You can
mitigate that by keeping a copy of new.git's refs in orig.git. Something
like:

  git -C orig.git fetch ../new.git refs/*:refs/preserve/*
  git -C orig.git gc

But that doesn't know about reflogs at all! It will still corrupt them
(but only sometimes; depending how often you do that fetch, you might
catch the same values in orig.git's reflog).

Possibly the correct answer here is "turn off reflogs in new.git", but
they are sometimes useful, and things _mostly_ work (for history that
doesn't rewind, or where the rewound bits are specific to new.git). So
it's useful to be able to run something like "reflog expire --stale-fix"
to clear out the occasional problem.

(A careful reader will note that objects mentioned only in the index
have a similar problem; but again, those tend to be local to new.git,
and don't exist at all in a server context).

-Peff
Johannes Schindelin Feb. 12, 2021, 4:04 p.m. UTC | #3
Hi,

On Thu, 11 Feb 2021, Jeff King wrote:

> On Wed, Feb 10, 2021 at 12:30:27PM -0800, Junio C Hamano wrote:
>
> > I appreciate an effort of making it looser and less likely to get in
> > trouble when running in a corrupt repository, but --stale-fix is a
> > bit special and should probably be removed.
> >
> > The only reason the option was added was because we needed to have
> > an easy way to recover from a specific kind of reflog corruption
> > that would have resulted by a (then known) bug in "git reflog" in
> > the released version of Git that came immediately before the version
> > of Git that added the "fix" option, while the root cause of the
> > corruption got fixed.
> >
> > Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was
> > written, it was very useful to have a way to recover from the
> > corruption likely to have happened with the version of Git that came
> > before it.  But it no longer is relevant after this many years.
> > There may be other ways to break the reflog entries, but --stale-fix
> > was never designed to deal with anything but a specific way the
> > reflog gets corrupted (namely, by the old version of Git that
> > corrupted reflog in a specific way), so keeping it would not be very
> > useful.

Thank you for the original historical context.

> FWIW, I have used --stale-fix for cases where a repo's reflog was
> "corrupted" by its alternate pruning objects.
>
> E.g., if you do something like this:
>
>   git clone -s orig.git new.git
>
> you're now at risk of "git gc" in orig.git corrupting new.git, because
> its reachability check doesn't know anything about those refs. You can
> mitigate that by keeping a copy of new.git's refs in orig.git. Something
> like:
>
>   git -C orig.git fetch ../new.git refs/*:refs/preserve/*
>   git -C orig.git gc
>
> But that doesn't know about reflogs at all! It will still corrupt them
> (but only sometimes; depending how often you do that fetch, you might
> catch the same values in orig.git's reflog).
>
> Possibly the correct answer here is "turn off reflogs in new.git", but
> they are sometimes useful, and things _mostly_ work (for history that
> doesn't rewind, or where the rewound bits are specific to new.git). So
> it's useful to be able to run something like "reflog expire --stale-fix"
> to clear out the occasional problem.
>
> (A careful reader will note that objects mentioned only in the index
> have a similar problem; but again, those tend to be local to new.git,
> and don't exist at all in a server context).

I want to add the experience with that half year during which `git gc`
with worktrees was broken, as it would happily ignore the reflogs of the
worktree-specific `HEAD`s, all except the one from the worktree from which
`git gc` was run. That was some fun time, and `--stale-fix` was a life
saver.

With this _additional_ historical context, I would deem it wise to keep
`--stale-fix` around.

Ciao,
Dscho
Junio C Hamano Feb. 12, 2021, 7:35 p.m. UTC | #4
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Possibly the correct answer here is "turn off reflogs in new.git", but
>> they are sometimes useful, and things _mostly_ work (for history that
>> doesn't rewind, or where the rewound bits are specific to new.git). So
>> it's useful to be able to run something like "reflog expire --stale-fix"
>> to clear out the occasional problem.
>>
>> (A careful reader will note that objects mentioned only in the index
>> have a similar problem; but again, those tend to be local to new.git,
>> and don't exist at all in a server context).
>
> I want to add the experience with that half year during which `git gc`
> with worktrees was broken, as it would happily ignore the reflogs of the
> worktree-specific `HEAD`s, all except the one from the worktree from which
> `git gc` was run. That was some fun time, and `--stale-fix` was a life
> saver.

The option was invented for a specific case, but if its "fix"
applies for breakages caused by more recent bugs and user induced
actions, I would agree that that it gives us a good reason to keep
it around.

I have to wonder if the explanation in the documentation for the
option needs to be made less specific to the originally motivating
case, though.

Thanks.
diff mbox series

Patch

diff --git a/builtin/reflog.c b/builtin/reflog.c
index ca1d8079f320..09541d1c8048 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -602,6 +602,9 @@  static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 	 */
 	if (cb.cmd.stalefix) {
 		repo_init_revisions(the_repository, &cb.cmd.revs, prefix);
+		cb.cmd.revs.do_not_die_on_missing_tree = 1;
+		cb.cmd.revs.ignore_missing = 1;
+		cb.cmd.revs.ignore_missing_links = 1;
 		if (flags & EXPIRE_REFLOGS_VERBOSE)
 			printf(_("Marking reachable objects..."));
 		mark_reachable_objects(&cb.cmd.revs, 0, 0, NULL);
diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index ecccaa063453..27b9080251a9 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -158,6 +158,32 @@  test_expect_success 'reflog expire' '
 	check_fsck "dangling commit $K"
 '
 
+test_expect_success '--stale-fix handles missing objects generously' '
+	git -c core.logAllRefUpdates=false fast-import --date-format=now <<-EOS &&
+	commit refs/heads/stale-fix
+	mark :1
+	committer Author <a@uth.or> now
+	data <<EOF
+	start stale fix
+	EOF
+	M 100644 inline file
+	data <<EOF
+	contents
+	EOF
+	commit refs/heads/stale-fix
+	committer Author <a@uth.or> now
+	data <<EOF
+	stale fix branch tip
+	EOF
+	from :1
+	EOS
+
+	parent_oid=$(git rev-parse stale-fix^) &&
+	test_when_finished "recover $parent_oid" &&
+	corrupt $parent_oid &&
+	git reflog expire --stale-fix
+'
+
 test_expect_success 'prune and fsck' '
 
 	git prune &&