Message ID | Z5r679AyETgMO5Ge@ArchLinux (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | add more ref consistency checks | expand |
shejialuo <shejialuo@gmail.com> writes: > Although this behavior has no harm for the program, it will > short-circuit the program. When the users execute "git refs verify" or > "git fsck", we don't want to simply die the program but rather show the > warnings or errors as many as possible to info the users. "info" is not a verb; "inform"? I can understand what you want to say with "show the warnings or errors as many as possible", but giving errors on the same issue many times is not what you meant---rather, you want the checker to keep going and discover errors in many _other things_, after it finds a single error in "HEAD". ..., we do want to diagnose a broken "HEAD", but we want to notice as many breakages on other refs as we can instead of dying after finding the first breakage. Dying on a broken "HEAD" done by get_worktrees() goes against this goal. or something, perhaps. Such a rewrite makes the sentence "Although ... short-circuit the program." unnecessary. > So, we should > avoid reading the head info. With one reservation. We still want to diagnose a broken "HEAD", so I'd probably strike this sentence out, and add a statement that says we still check the contents of "HEAD" elsewhere as a substitute at the end of the proposed commit log message, if I were writing it, after explaining the use of get_worktrees_without_reading_head() you did in the following two paragraphs (both of which read well). Thanks.
On Thu, Jan 30, 2025 at 10:04:45AM -0800, Junio C Hamano wrote: > shejialuo <shejialuo@gmail.com> writes: > > > Although this behavior has no harm for the program, it will > > short-circuit the program. When the users execute "git refs verify" or > > "git fsck", we don't want to simply die the program but rather show the > > warnings or errors as many as possible to info the users. > > "info" is not a verb; "inform"? > Let me improve this in the next version. > I can understand what you want to say with "show the warnings or > errors as many as possible", but giving errors on the same issue > many times is not what you meant Yes, this is correct. > ---rather, you want the checker to > keep going and discover errors in many _other things_, after it > finds a single error in "HEAD". > I think this is a misunderstanding. Let me explain more to you. 1. If the content of the "HEAD" is not correct, we won't detect the current directory as valid git repository. 2. If the referent of the "HEAD" is not in the "packed-refs", the referent must be a loose ref or don't exist. In this situation, because we will never touch the packed backend. 3. If the referent of the "HEAD" is in the "packed-refs", it will call "create_snapshot" to create the snapshot. In this function, it would call "verify_buffer_safe" to check the following things: 1. Check the correctness of th header. 2. Check via "verify_buffer_safe" method So, even the referent entry is not correct in the "packed-refs", the program won't die. But the above two cases will let the program die. I want to say we cannot find any error in "HEAD" at now as above described. From my perspective, we should retain the paragraph: > Although this behavior has no harm for the program... But we should change the statement > we don't want to simply die the program but rather show the > warnings or errors as many as possible to info the users. So, we should > avoid reading the head info. to We should avoid reading the head information, which may execute the read operation in packed backend with stricter checks to die the program. Instead, we should continue to check other parts of the "packed-refs" file completely. > With one reservation. We still want to diagnose a broken "HEAD", so > I'd probably strike this sentence out, and add a statement that says > we still check the contents of "HEAD" elsewhere as a substitute at > the end of the proposed commit log message, if I were writing it, > after explaining the use of get_worktrees_without_reading_head() > you did in the following two paragraphs (both of which read well). I want to say that we cannot check the content of the "HEAD" itself. If the content of "HEAD" is not correct, we cannot detect the current directory as a valid git repository. So, there is no need to say "we will check the contents of 'HEAD' else where". I think the misunderstanding is that you think that if the "HEAD" is not correct, the program will die but actually it is not. Thanks, Jialuo
shejialuo <shejialuo@gmail.com> writes: > I want to say that we cannot check the content of the "HEAD" itself. If > the content of "HEAD" is not correct, we cannot detect the current > directory as a valid git repository. So, there is no need to say "we > will check the contents of 'HEAD' else where". Instead you should say "we detected your HEAD is broken" somewhere in the documentation for this, and then the end-user should get a message to telling them about the broken HEAD in such a case, though.
diff --git a/builtin/refs.c b/builtin/refs.c index a29f195834..55ff5dae11 100644 --- a/builtin/refs.c +++ b/builtin/refs.c @@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix, git_config(git_fsck_config, &fsck_refs_options); prepare_repo_settings(the_repository); - worktrees = get_worktrees(); + worktrees = get_worktrees_without_reading_head(); for (size_t i = 0; worktrees[i]; i++) ret |= refs_fsck(get_worktree_ref_store(worktrees[i]), &fsck_refs_options, worktrees[i]); diff --git a/worktree.c b/worktree.c index 248bbb39d4..89b7d86cef 100644 --- a/worktree.c +++ b/worktree.c @@ -175,6 +175,11 @@ struct worktree **get_worktrees(void) return get_worktrees_internal(0); } +struct worktree **get_worktrees_without_reading_head(void) +{ + return get_worktrees_internal(1); +} + const char *get_worktree_git_dir(const struct worktree *wt) { if (!wt) diff --git a/worktree.h b/worktree.h index 38145df80f..1ba4a161a0 100644 --- a/worktree.h +++ b/worktree.h @@ -30,6 +30,12 @@ struct worktree { */ struct worktree **get_worktrees(void); +/* + * Like `get_worktrees`, but does not read HEAD. This is useful when checking + * the consistency, as reading HEAD may not be necessary. + */ +struct worktree **get_worktrees_without_reading_head(void); + /* * Returns 1 if linked worktrees exist, 0 otherwise. */
In "packed-backend.c", there are some functions such as "create_snapshot" and "next_record" which would check the correctness of the content of the "packed-ref" file. When anything is bad, the program will die. It may seem that we have nothing relevant to above feature, because we are going to read and parse the raw "packed-ref" file without creating the snapshot and using the ref iterator to check the consistency. However, when using "get_worktrees" in "builtin/refs", we would parse the "HEAD" information. If the referent of the "HEAD" is inside the "packed-ref", we will call "create_snapshot" function to parse the "packed-ref" to get the information. No matter whether the entry of "HEAD" in "packed-ref" is correct, "create_snapshot" would call "verify_buffer_safe" to check whether there is a newline in the last line of the file. If not, the program will die. Although this behavior has no harm for the program, it will short-circuit the program. When the users execute "git refs verify" or "git fsck", we don't want to simply die the program but rather show the warnings or errors as many as possible to info the users. So, we should avoid reading the head info. Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing worktrees, 2023-12-29), we have introduced a function "get_worktrees_internal" which allows us to get worktrees without reading head info. Create a new exposed function "get_worktrees_without_reading_head", then replace the "get_worktrees" in "builtin/refs" with the new created function. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> --- builtin/refs.c | 2 +- worktree.c | 5 +++++ worktree.h | 6 ++++++ 3 files changed, 12 insertions(+), 1 deletion(-)