Message ID | Zvj-jkFE9NN30uDl@ArchLinux (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | add ref content check for files backend | expand |
On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote: > We have already set up the infrastructure to check the consistency for > refs, but we do not support multiple worktrees. As we decide to add more > checks for ref content, we need to set up support for multiple > worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check > refs under the worktrees. Makes sense. > Because we should only check once for "packed-refs", let's call the fsck > function for packed-backend when in the main worktree. In order to know > which directory we check, we should default print this information > instead of specifying "--verbose". This change should likely be evicted into its own commit with a bit more explanation. > It's not suitable to print these information to the stderr. So, change > to stdout. This one, too. Why exactly is in not suitable to print to stderr? > Mentored-by: Patrick Steinhardt <ps@pks.im> > Mentored-by: Karthik Nayak <karthik.188@gmail.com> > Signed-off-by: shejialuo <shejialuo@gmail.com> > --- > builtin/refs.c | 11 ++++++-- > refs/files-backend.c | 18 ++++++++---- > t/t0602-reffiles-fsck.sh | 59 ++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 81 insertions(+), 7 deletions(-) > > diff --git a/builtin/refs.c b/builtin/refs.c > index 24978a7b7b..3c492ea922 100644 > --- a/builtin/refs.c > +++ b/builtin/refs.c > @@ -5,6 +5,7 @@ > #include "parse-options.h" > #include "refs.h" > #include "strbuf.h" > +#include "worktree.h" > > #define REFS_MIGRATE_USAGE \ > N_("git refs migrate --ref-format=<format> [--dry-run]") > @@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix) > static int cmd_refs_verify(int argc, const char **argv, const char *prefix) > { > struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT; > + struct worktree **worktrees, **p; > const char * const verify_usage[] = { > REFS_VERIFY_USAGE, > NULL, > @@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix) > OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")), > OPT_END(), > }; > - int ret; > + int ret = 0; > > argc = parse_options(argc, argv, prefix, options, verify_usage, 0); > if (argc) > @@ -84,9 +86,14 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix) > git_config(git_fsck_config, &fsck_refs_options); > prepare_repo_settings(the_repository); > > - ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options); > + worktrees = get_worktrees(); > + for (p = worktrees; *p; p++) { > + struct worktree *wt = *p; > + ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options); > + } I think it is more customary to say `ret |=` instead of `ref +=`. Otherwise we could at least in theory wrap around and even land at `ret == 0`, even though this is quite unlikely. > fsck_options_clear(&fsck_refs_options); > + free_worktrees(worktrees); > return ret; > } > [snip] > @@ -3600,8 +3600,16 @@ static int files_fsck(struct ref_store *ref_store, > struct files_ref_store *refs = > files_downcast(ref_store, REF_STORE_READ, "fsck"); > > - return files_fsck_refs(ref_store, o) | > - refs->packed_ref_store->be->fsck(refs->packed_ref_store, o); > + int ret = files_fsck_refs(ref_store, o); > + > + /* > + * packed-refs should only be checked once because it is shared > + * between all worktrees. > + */ > + if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir)) > + ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o); > + > + return ret; > } > > struct ref_storage_be refs_be_files = { What is the current behaviour? Is it that we verify the packed-refs file multiple times, or rather that we call `packed_ref_store->be->fsck()` many times even though we know it won't do anything for anything except for the main worktree? If it is the former I very much agree that we should make this conditional. If it's the latter I'm more in the camp of letting it be such that if worktrees were to ever gain support for "packed-refs" we wouldn't have to change anything. In any case, as proposed I think it would make sense to evict this into a standalone commit such that these details can be explained in the commit message. Patrick
On Mon, Oct 07, 2024 at 08:58:30AM +0200, Patrick Steinhardt wrote: > On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote: > > We have already set up the infrastructure to check the consistency for > > refs, but we do not support multiple worktrees. As we decide to add more > > checks for ref content, we need to set up support for multiple > > worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check > > refs under the worktrees. > > Makes sense. > > > Because we should only check once for "packed-refs", let's call the fsck > > function for packed-backend when in the main worktree. In order to know > > which directory we check, we should default print this information > > instead of specifying "--verbose". > > This change should likely be evicted into its own commit with a bit more > explanation. > > > It's not suitable to print these information to the stderr. So, change > > to stdout. > > This one, too. Why exactly is in not suitable to print to stderr? > I am sorry for the confusion. We should not print which directory we check here into stderr. Because I think this will make test script contain many unrelated info when using "git refs verify 2>err". The reason here is when checking the consistency of refs in multiple worktrees. The ref name could be repeat. For example, worktree A has its own ref called "test" under ".git/worktrees/A/refs/worktree/test" and worktree B has its own ref still called "test" under ".git/worktrees/B/refs/worktree/test". However, the refname would be printed to "refs/worktree/test". It will make the user confused which "refs/worktree/test" is checked. So, we should print this information like: Checking references consistency in .git ... checking references consistency in .git/worktrees/A ... checking references consistency in .git/worktrees/B However, when writing this, I feel a ".git" is a bad usage. It will make the user think it will check everything here. This should be improved in the next version. > > @@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix) > > OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")), > > OPT_END(), > > }; > > - int ret; > > + int ret = 0; > > > > argc = parse_options(argc, argv, prefix, options, verify_usage, 0); > > if (argc) > > @@ -84,9 +86,14 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix) > > git_config(git_fsck_config, &fsck_refs_options); > > prepare_repo_settings(the_repository); > > > > - ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options); > > + worktrees = get_worktrees(); > > + for (p = worktrees; *p; p++) { > > + struct worktree *wt = *p; > > + ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options); > > + } > > I think it is more customary to say `ret |=` instead of `ref +=`. > Otherwise we could at least in theory wrap around and even land at `ret > == 0`, even though this is quite unlikely. > I agree here. I will improve this in the next version. [snip] > > @@ -3600,8 +3600,16 @@ static int files_fsck(struct ref_store *ref_store, > > struct files_ref_store *refs = > > files_downcast(ref_store, REF_STORE_READ, "fsck"); > > > > - return files_fsck_refs(ref_store, o) | > > - refs->packed_ref_store->be->fsck(refs->packed_ref_store, o); > > + int ret = files_fsck_refs(ref_store, o); > > + > > + /* > > + * packed-refs should only be checked once because it is shared > > + * between all worktrees. > > + */ > > + if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir)) > > + ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o); > > + > > + return ret; > > } > > > > struct ref_storage_be refs_be_files = { > > What is the current behaviour? Is it that we verify the packed-refs file > multiple times, or rather that we call `packed_ref_store->be->fsck()` > many times even though we know it won't do anything for anything except > for the main worktree? > That's a good question. I think the second is the current behaviour. We will call `packed_ref_store->be->fsck()` many times. I understand what you mean here, we just put the check into `packed_ref_store->be->fsck()` function. > If it is the former I very much agree that we should make this > conditional. If it's the latter I'm more in the camp of letting it be > such that if worktrees were to ever gain support for "packed-refs" we > wouldn't have to change anything. > I agree. > In any case, as proposed I think it would make sense to evict this into > a standalone commit such that these details can be explained in the > commit message. > Yes, the current commit message lacks of details. > Patrick Thanks, Jialuo
On Mon, Oct 07, 2024 at 04:42:21PM +0800, shejialuo wrote: > On Mon, Oct 07, 2024 at 08:58:30AM +0200, Patrick Steinhardt wrote: > > On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote: > > > We have already set up the infrastructure to check the consistency for > > > refs, but we do not support multiple worktrees. As we decide to add more > > > checks for ref content, we need to set up support for multiple > > > worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check > > > refs under the worktrees. > > > > Makes sense. > > > > > Because we should only check once for "packed-refs", let's call the fsck > > > function for packed-backend when in the main worktree. In order to know > > > which directory we check, we should default print this information > > > instead of specifying "--verbose". > > > > This change should likely be evicted into its own commit with a bit more > > explanation. > > > > > It's not suitable to print these information to the stderr. So, change > > > to stdout. > > > > This one, too. Why exactly is in not suitable to print to stderr? > > > > I am sorry for the confusion. We should not print which directory we > check here into stderr. Because I think this will make test script > contain many unrelated info when using "git refs verify 2>err". > > The reason here is when checking the consistency of refs in multiple > worktrees. The ref name could be repeat. For example, worktree A > has its own ref called "test" under ".git/worktrees/A/refs/worktree/test" > and worktree B has its own ref still called "test" under > ".git/worktrees/B/refs/worktree/test". > > However, the refname would be printed to "refs/worktree/test". It will > make the user confused which "refs/worktree/test" is checked. So, we > should print this information like: > > Checking references consistency in .git > ... > checking references consistency in .git/worktrees/A > ... > checking references consistency in .git/worktrees/B > > However, when writing this, I feel a ".git" is a bad usage. It will make > the user think it will check everything here. This should be improved in > the next version. But wouldn't it be the better solution if we printed the fully-qualified reference name "worktrees/worktree/refs/worktree/test" instead? That would remove the need to say which directory we're currently verifying in the first place. Patrick
On Mon, Oct 07, 2024 at 11:16:19AM +0200, Patrick Steinhardt wrote: [snip] > > However, the refname would be printed to "refs/worktree/test". It will > > make the user confused which "refs/worktree/test" is checked. So, we > > should print this information like: > > > > Checking references consistency in .git > > ... > > checking references consistency in .git/worktrees/A > > ... > > checking references consistency in .git/worktrees/B > > > > However, when writing this, I feel a ".git" is a bad usage. It will make > > the user think it will check everything here. This should be improved in > > the next version. > > But wouldn't it be the better solution if we printed the fully-qualified > reference name "worktrees/worktree/refs/worktree/test" instead? That > would remove the need to say which directory we're currently verifying > in the first place. > Good idea. I will use this way in the next version. > Patrick Thanks
diff --git a/builtin/refs.c b/builtin/refs.c index 24978a7b7b..3c492ea922 100644 --- a/builtin/refs.c +++ b/builtin/refs.c @@ -5,6 +5,7 @@ #include "parse-options.h" #include "refs.h" #include "strbuf.h" +#include "worktree.h" #define REFS_MIGRATE_USAGE \ N_("git refs migrate --ref-format=<format> [--dry-run]") @@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix) static int cmd_refs_verify(int argc, const char **argv, const char *prefix) { struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT; + struct worktree **worktrees, **p; const char * const verify_usage[] = { REFS_VERIFY_USAGE, NULL, @@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")), OPT_END(), }; - int ret; + int ret = 0; argc = parse_options(argc, argv, prefix, options, verify_usage, 0); if (argc) @@ -84,9 +86,14 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix) git_config(git_fsck_config, &fsck_refs_options); prepare_repo_settings(the_repository); - ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options); + worktrees = get_worktrees(); + for (p = worktrees; *p; p++) { + struct worktree *wt = *p; + ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options); + } fsck_options_clear(&fsck_refs_options); + free_worktrees(worktrees); return ret; } diff --git a/refs/files-backend.c b/refs/files-backend.c index 03d2503276..57318b4c4e 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -3558,7 +3558,7 @@ static int files_fsck_refs_dir(struct ref_store *ref_store, } else if (S_ISREG(iter->st.st_mode) || S_ISLNK(iter->st.st_mode)) { if (o->verbose) - fprintf_ln(stderr, "Checking %s/%s", + fprintf_ln(stdout, "Checking %s/%s", refs_check_dir, iter->relative_path); for (size_t i = 0; fsck_refs_fn[i]; i++) { if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter)) @@ -3589,8 +3589,8 @@ static int files_fsck_refs(struct ref_store *ref_store, NULL, }; - if (o->verbose) - fprintf_ln(stderr, _("Checking references consistency")); + fprintf_ln(stdout, _("Checking references consistency in %s"), + ref_store->gitdir); return files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fn); } @@ -3600,8 +3600,16 @@ static int files_fsck(struct ref_store *ref_store, struct files_ref_store *refs = files_downcast(ref_store, REF_STORE_READ, "fsck"); - return files_fsck_refs(ref_store, o) | - refs->packed_ref_store->be->fsck(refs->packed_ref_store, o); + int ret = files_fsck_refs(ref_store, o); + + /* + * packed-refs should only be checked once because it is shared + * between all worktrees. + */ + if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir)) + ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o); + + return ret; } struct ref_storage_be refs_be_files = { diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh index 71a4d1a5ae..4c6cd6f7d0 100755 --- a/t/t0602-reffiles-fsck.sh +++ b/t/t0602-reffiles-fsck.sh @@ -89,4 +89,63 @@ test_expect_success 'ref name check should be adapted into fsck messages' ' test_must_be_empty err ' +test_expect_success 'ref name check should work for multiple worktrees' ' + test_when_finished "rm -rf repo" && + git init repo && + + cd repo && + test_commit initial && + git checkout -b branch-1 && + test_commit second && + git checkout -b branch-2 && + test_commit third && + git checkout -b branch-3 && + git worktree add ./worktree-1 branch-1 && + git worktree add ./worktree-2 branch-2 && + worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree && + worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree && + + ( + cd worktree-1 && + git update-ref refs/worktree/branch-4 refs/heads/branch-3 + ) && + ( + cd worktree-2 && + git update-ref refs/worktree/branch-4 refs/heads/branch-3 + ) && + + cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/.branch-2 && + cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/@ && + + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: refs/worktree/.branch-2: badRefName: invalid refname format + error: refs/worktree/@: badRefName: invalid refname format + EOF + sort err >sorted_err && + test_cmp expect sorted_err && + + ( + cd worktree-1 && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: refs/worktree/.branch-2: badRefName: invalid refname format + error: refs/worktree/@: badRefName: invalid refname format + EOF + sort err >sorted_err && + test_cmp expect sorted_err + ) && + + ( + cd worktree-2 && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: refs/worktree/.branch-2: badRefName: invalid refname format + error: refs/worktree/@: badRefName: invalid refname format + EOF + sort err >sorted_err && + test_cmp expect sorted_err + ) +' + test_done
We have already set up the infrastructure to check the consistency for refs, but we do not support multiple worktrees. As we decide to add more checks for ref content, we need to set up support for multiple worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check refs under the worktrees. Because we should only check once for "packed-refs", let's call the fsck function for packed-backend when in the main worktree. In order to know which directory we check, we should default print this information instead of specifying "--verbose". It's not suitable to print these information to the stderr. So, change to stdout. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> --- builtin/refs.c | 11 ++++++-- refs/files-backend.c | 18 ++++++++---- t/t0602-reffiles-fsck.sh | 59 ++++++++++++++++++++++++++++++++++++++++ 3 files changed, 81 insertions(+), 7 deletions(-)