[4/3] fsck: check even zero-entry index files

Message ID Y/vdV4bjorvRYoaR@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit 8d3e7eac529b42319622692028b45670bdff8835
Series fsck index files from all worktrees

Commit Message

Jeff King Feb. 26, 2023, 10:29 p.m. UTC
On Fri, Feb 24, 2023 at 09:30:44AM -0800, Junio C Hamano wrote:

> So we had a separate worktree with its index pointing at an object
> by its resolve-undo (or cache-tree) extension, but somehow lost that
> object to gc (I agree with your assessment that it should no longer
> happen since 2017).  gc these days knows about looking at the index
> of all worktrees, finds the issue, and stops for safety.  fsck that
> is run in the primary worktree may not have noticed but fsck run
> from that worktree would notice the issue.
> 
> Sounds like a frustrating one.  
> 
> Thanks, both, for finding and fixing.

I saw that this hit next, but I had a few fixups that I had planned to
squash in. I saw you got the leak-fix one, but I have one more. Since
this is the end of the cycle, we _could_ just squash it in when we
rewind next. But having now written it as a patch on top, I think the
explanation kind of merits its own commit.

-- >8 --
Subject: [PATCH] fsck: check even zero-entry index files

In fb64ca526a (fsck: check index files in all worktrees, 2023-02-24), we
swapped out a call to vanilla repo_read_index() for a series of
read_index_from() calls, one per worktree. The code for the latter was
copied from add_index_objects_to_pending(), which checks for a positive
return value from the index reading function, and we do the same here in
fsck now.

But this is probably the wrong thing. I had interpreted the check as
"don't operate on the index struct if there was an error". But in
reality, if there is an error then the index-reading code will simply
die (which admittedly is not great for fsck, but that is not a new
problem).

The return value here is actually the number of entries read. So it
makes sense for add_index_objects_to_pending() to ignore a zero-entry
index (there is nothing to add). But for fsck, we would still want to
check any extensions, etc (though presumably it is unlikely to have them
in an empty index, I don't think it's impossible).

So we should ignore the return value from read_index_from() entirely.
This matches the behavior before fb64ca526a, when we ignored the return
value from repo_read_index().

Signed-off-by: Jeff King <peff@peff.net>
---
On top of jk/fsck-indices-in-worktrees.

 builtin/fsck.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Derrick Stolee Feb. 27, 2023, 12:09 p.m. UTC | #1
On 2/26/23 5:29 PM, Jeff King wrote:
> On Fri, Feb 24, 2023 at 09:30:44AM -0800, Junio C Hamano wrote:
> 
>> So we had a separate worktree with its index pointing at an object
>> by its resolve-undo (or cache-tree) extension, but somehow lost that
>> object to gc (I agree with your assessment that it should no longer
>> happen since 2017).  gc these days knows about looking at the index
>> of all worktrees, finds the issue, and stops for safety.  fsck that
>> is run in the primary worktree may not have noticed but fsck run
>> from that worktree would notice the issue.
>>
>> Sounds like a frustrating one.  
>>
>> Thanks, both, for finding and fixing.
> 
> I saw that this hit next, but I had a few fixups that I had planned to
> squash in. I saw you got the leak-fix one, but I have one more. Since
> this is the end of the cycle, we _could_ just squash it in when we
> rewind next. But having now written it as a patch on top, I think the
> explanation kind of merits its own commit.

I just read all four (and a half) patches and agree that this
is a valuable change. Thanks for working on it.

-Stolee
Junio C Hamano Feb. 27, 2023, 3:58 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> The return value here is actually the number of entries read. So it
> makes sense for add_index_objects_to_pending() to ignore a zero-entry
> index (there is nothing to add). But for fsck, we would still want to
> check any extensions, etc (though presumably it is unlikely to have them
> in an empty index, I don't think it's impossible).

Good thinking.

Not all extensions record what needs to be fed to the reachability
machinery for fsck, but resolve-undo wants to record object names
that used to be in the directory (at higher stages) when they are
removed, so I think it is entirely possible for an index with no
entries to have index extensions that fsck needs to pay attention
to.

> So we should ignore the return value from read_index_from() entirely.
> This matches the behavior before fb64ca526a, when we ignored the return
> value from repo_read_index().

Good.  Thanks.

>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> On top of jk/fsck-indices-in-worktrees.
>
>  builtin/fsck.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 1b032eebb1..64614b43b2 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -1007,9 +1007,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>  			 * while we're examining the index.
>  			 */
>  			path = xstrdup(worktree_git_path(wt, "index"));
> -			if (read_index_from(&istate, path,
> -					    get_worktree_git_dir(wt)) > 0)
> -				fsck_index(&istate, path, wt->is_current);
> +			read_index_from(&istate, path, get_worktree_git_dir(wt));
> +			fsck_index(&istate, path, wt->is_current);
>  			discard_index(&istate);
>  			free(path);
>  		}
Patch

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1b032eebb1..64614b43b2 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -1007,9 +1007,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 			 * while we're examining the index.
 			 */
 			path = xstrdup(worktree_git_path(wt, "index"));
-			if (read_index_from(&istate, path,
-					    get_worktree_git_dir(wt)) > 0)
-				fsck_index(&istate, path, wt->is_current);
+			read_index_from(&istate, path, get_worktree_git_dir(wt));
+			fsck_index(&istate, path, wt->is_current);
 			discard_index(&istate);
 			free(path);
 		}