[0/3] fsck index files from all worktrees

Message ID Y/hv0MXAyBY3HEo9@coredump.intra.peff.net (mailing list archive)

Message

Jeff King Feb. 24, 2023, 8:05 a.m. UTC
On Sat, Feb 18, 2023 at 10:38:33AM +0100, Johannes Sixt wrote:

> I see three problems here:
> 
> - git fsck should detect the problem (if it really is one) in the
> worktree index. It seems that it is just an index extension that is
> affected. Perhaps it should be just a warning, not an error.

We do fsck the resolve-undo extension, but I think fsck just doesn't
know anything about worktrees. That should be easy enough to fix.
Patches below.

> - If the objects mentioned in the index extension are precious, they
> should not have been garbage-collected in earlier rounds of git gc
> (which I certainly did at some point).

Correct, but the gc error you're getting indicates that we _are_ trying
to treat them as included. I wonder if you ran git-gc long ago with an
older version of Git, and this breakage was waiting to surface. AFAICT
this was all fixed by 8a044c7f1d (Merge branch 'nd/prune-in-worktree',
2017-09-19).

> - I can't git gc the repository now, which is particularly annoying when
> auto-gc is attempted after almost every git command. Of course, I know
> how to get out of the situation, but it took some time to identify the
> worktree index as the culprit. Not something that a beginner would be
> able to do easily.

I think in general that "oops, there's something corrupt" can be hard to
get out of, just because there are so many possibilities. But if we can
at least report the nature of the problem and the offending filename via
git-fsck, that would help with pointing people in the right direction.

> The repository I use for the above commands is attached. I hope vger
> doesn't strip it away.

Thanks, it was nice to have a test case. I ended up writing a separate
test with a missing blob, just because that's simpler to do. It looks
like we don't test fsck_resolve_undo() or fsck_cache_tree() at all. That
might be a nice addition, but I punted for now to stay focused on the
worktree aspects.
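
For readers who want to reproduce the "missing blob" situation by hand, here is a rough sketch. It is not the test from the series; the function name make_broken_worktree, the directory layout, and the placeholder identity are assumptions made for illustration. It assumes the scratch directory is given as an absolute path and that the blob is still a loose object, which should hold right after "git add".

```shell
# Hypothetical helper (not part of the series): build a throwaway repository
# whose linked worktree index refers to a blob that no longer exists.
# "$1" must be an empty scratch directory, given as an absolute path.
make_broken_worktree () {
	dir=$1
	git init -q "$dir/repo" &&
	# A linked worktree needs a commit to check out; the identity is a placeholder.
	git -C "$dir/repo" -c user.name=Example -c user.email=example@example.com \
		commit -q --allow-empty -m base &&
	git -C "$dir/repo" worktree add -q "$dir/wt" &&
	# Stage a file in the linked worktree only, so that only its index
	# mentions the new blob.
	echo "content to lose" >"$dir/wt/file" &&
	git -C "$dir/wt" add file &&
	blob=$(git -C "$dir/wt" rev-parse :file) &&
	# Delete the loose object behind the staged entry to simulate the loss.
	rm -f "$dir/repo/.git/objects/$(echo "$blob" | cut -c1-2)/$(echo "$blob" | cut -c3-)"
}
```

After calling it on a fresh "mktemp -d" directory, "git -C <dir>/wt fsck" and "git -C <dir>/repo fsck" can be compared. Whether the second one reports the missing blob depends on whether the Git in use checks the index files of all worktrees.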

  [1/3]: fsck: factor out index fsck
  [2/3]: fsck: check index files in all worktrees
  [3/3]: fsck: mention file path for index errors

 builtin/fsck.c  | 93 ++++++++++++++++++++++++++++++++-----------------
 t/t1450-fsck.sh | 30 ++++++++++++++++
 2 files changed, 92 insertions(+), 31 deletions(-)

-Peff

Comments

Junio C Hamano Feb. 24, 2023, 5:30 p.m. UTC | #1
Jeff King <peff@peff.net> writes:

> We do fsck the resolve-undo extension, but I think fsck just doesn't
> know anything about worktrees. That should be easy enough to fix.
> Patches below.
> ...
> Thanks, it was nice to have a test case. I ended up writing a separate
> test with a missing blob, just because that's simpler to do. It looks
> like we don't test fsck_resolve_undo() or fsck_cache_tree() at all. That
> might be a nice addition, but I punted for now to stay focused on the
> worktree aspects.

So we had a separate worktree with its index pointing at an object
by its resolve-undo (or cache-tree) extension, but somehow lost that
object to gc (I agree with your assessment that it should no longer
happen since 2017).  gc these days knows about looking at the index
of all worktrees, finds the issue, and stops for safety.  fsck that
is run in the primary worktree may not have noticed but fsck run
from that worktree would notice the issue.
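
On a Git that only checks the index of the worktree it is run from, one possible workaround is to run fsck once per worktree. The following is a minimal sketch under that assumption; the function names worktree_paths and fsck_each_worktree are made up for illustration, and it relies only on "git worktree list --porcelain" and "git -C <path> fsck".

```shell
# Hypothetical helper: read `git worktree list --porcelain` output on stdin
# and print one worktree path per line.
worktree_paths () {
	sed -n 's/^worktree //p'
}

# Hypothetical helper: run `git fsck` from every worktree of the current
# repository; return non-zero if any of the runs fails.
fsck_each_worktree () {
	rc=0
	while IFS= read -r wt
	do
		[ -n "$wt" ] || continue
		echo "== fsck in $wt"
		git -C "$wt" fsck || rc=1
	done <<EOF
$(git worktree list --porcelain | worktree_paths)
EOF
	return $rc
}
```

Calling fsck_each_worktree from anywhere inside the repository repeats the full object check for each worktree, so it is slow on large repositories. It is meant as a stopgap for locating the offending index, not as a replacement for fsck handling worktrees itself.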

Sounds like a frustrating one.

Thanks, both, for finding and fixing.

Johannes Sixt Feb. 26, 2023, 9:49 p.m. UTC | #2
On 24.02.23 at 09:05, Jeff King wrote:
> On Sat, Feb 18, 2023 at 10:38:33AM +0100, Johannes Sixt wrote:
> 
>> I see three problems here:
>>
>> - git fsck should detect the problem (if it really is one) in the
>> worktree index. It seems that it is just an index extension that is
>> affected. Perhaps it should be just a warning, not an error.
> 
> We do fsck the resolve-undo extension, but I think fsck just doesn't
> know anything about worktrees. That should be easy enough to fix.
> Patches below.
> 
>> - If the objects mentioned in the index extension are precious, they
>> should not have been garbage-collected in earlier rounds of git gc
>> (which I certainly did at some point).
> 
> Correct, but the gc error you're getting indicates that we _are_ trying
> to treat them as included. I wonder if you ran git-gc long ago with an
> older version of Git, and this breakage was waiting to surface. AFAICT
> this was all fixed by 8a044c7f1d (Merge branch 'nd/prune-in-worktree',
> 2017-09-19).

I don't know how I got into the situation. The worktree is a lot younger
than that and was made with a Git version young enough to include this
commit. I'll see if it happens again.

>> - I can't git gc the repository now, which is particularly annoying when
>> auto-gc is attempted after almost every git command. Of course, I know
>> how to get out of the situation, but it took some time to identify the
>> worktree index as the culprit. Not something that a beginner would be
>> able to do easily.
> 
> I think in general that "oops, there's something corrupt" can be hard to
> get out of, just because there are so many possibilities. But if we can
> at least report the nature of the problem and the offending filename via
> git-fsck, that would help with pointing people in the right direction.

Agreed. Thanks a lot for the patches, they are certainly helpful.

-- Hannes