diff mbox series

[v5,2/9] builtin/refs: support multiple worktrees check for refs.

Message ID Zvj-jkFE9NN30uDl@ArchLinux (mailing list archive)
State New
Headers show
Series add ref content check for files backend | expand

Commit Message

shejialuo Sept. 29, 2024, 7:15 a.m. UTC
We have already set up the infrastructure to check the consistency for
refs, but we do not support multiple worktrees. As we decide to add more
checks for ref content, we need to set up support for multiple
worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check
refs under the worktrees.

Because we should only check once for "packed-refs", let's call the fsck
function for packed-backend when in the main worktree. In order to know
which directory we check, we should default print this information
instead of specifying "--verbose".

It's not suitable to print these information to the stderr. So, change
to stdout.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 builtin/refs.c           | 11 ++++++--
 refs/files-backend.c     | 18 ++++++++----
 t/t0602-reffiles-fsck.sh | 59 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 81 insertions(+), 7 deletions(-)

Comments

Patrick Steinhardt Oct. 7, 2024, 6:58 a.m. UTC | #1
On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote:
> We have already set up the infrastructure to check the consistency for
> refs, but we do not support multiple worktrees. As we decide to add more
> checks for ref content, we need to set up support for multiple
> worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check
> refs under the worktrees.

Makes sense.

> Because we should only check once for "packed-refs", let's call the fsck
> function for packed-backend when in the main worktree. In order to know
> which directory we check, we should default print this information
> instead of specifying "--verbose".

This change should likely be evicted into its own commit with a bit more
explanation.

> It's not suitable to print these information to the stderr. So, change
> to stdout.

This one, too. Why exactly is in not suitable to print to stderr?

> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
>  builtin/refs.c           | 11 ++++++--
>  refs/files-backend.c     | 18 ++++++++----
>  t/t0602-reffiles-fsck.sh | 59 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 81 insertions(+), 7 deletions(-)
> 
> diff --git a/builtin/refs.c b/builtin/refs.c
> index 24978a7b7b..3c492ea922 100644
> --- a/builtin/refs.c
> +++ b/builtin/refs.c
> @@ -5,6 +5,7 @@
>  #include "parse-options.h"
>  #include "refs.h"
>  #include "strbuf.h"
> +#include "worktree.h"
>  
>  #define REFS_MIGRATE_USAGE \
>  	N_("git refs migrate --ref-format=<format> [--dry-run]")
> @@ -66,6 +67,7 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
>  static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
>  {
>  	struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
> +	struct worktree **worktrees, **p;
>  	const char * const verify_usage[] = {
>  		REFS_VERIFY_USAGE,
>  		NULL,
> @@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
>  		OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
>  		OPT_END(),
>  	};
> -	int ret;
> +	int ret = 0;
>  
>  	argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
>  	if (argc)
> @@ -84,9 +86,14 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
>  	git_config(git_fsck_config, &fsck_refs_options);
>  	prepare_repo_settings(the_repository);
>  
> -	ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
> +	worktrees = get_worktrees();
> +	for (p = worktrees; *p; p++) {
> +		struct worktree *wt = *p;
> +		ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options);
> +	}

I think it is more customary to say `ret |=` instead of `ref +=`.
Otherwise we could at least in theory wrap around and even land at `ret
== 0`, even though this is quite unlikely.

>  	fsck_options_clear(&fsck_refs_options);
> +	free_worktrees(worktrees);
>  	return ret;
>  }
>  
[snip]
> @@ -3600,8 +3600,16 @@ static int files_fsck(struct ref_store *ref_store,
>  	struct files_ref_store *refs =
>  		files_downcast(ref_store, REF_STORE_READ, "fsck");
>  
> -	return files_fsck_refs(ref_store, o) |
> -	       refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
> +	int ret = files_fsck_refs(ref_store, o);
> +
> +	/*
> +	 * packed-refs should only be checked once because it is shared
> +	 * between all worktrees.
> +	 */
> +	if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir))
> +		ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
> +
> +	return ret;
>  }
>  
>  struct ref_storage_be refs_be_files = {

What is the current behaviour? Is it that we verify the packed-refs file
multiple times, or rather that we call `packed_ref_store->be->fsck()`
many times even though we know it won't do anything for anything except
for the main worktree?

If it is the former I very much agree that we should make this
conditional. If it's the latter I'm more in the camp of letting it be
such that if worktrees were to ever gain support for "packed-refs" we
wouldn't have to change anything.

In any case, as proposed I think it would make sense to evict this into
a standalone commit such that these details can be explained in the
commit message.

Patrick
shejialuo Oct. 7, 2024, 8:42 a.m. UTC | #2
On Mon, Oct 07, 2024 at 08:58:30AM +0200, Patrick Steinhardt wrote:
> On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote:
> > We have already set up the infrastructure to check the consistency for
> > refs, but we do not support multiple worktrees. As we decide to add more
> > checks for ref content, we need to set up support for multiple
> > worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check
> > refs under the worktrees.
> 
> Makes sense.
> 
> > Because we should only check once for "packed-refs", let's call the fsck
> > function for packed-backend when in the main worktree. In order to know
> > which directory we check, we should default print this information
> > instead of specifying "--verbose".
> 
> This change should likely be evicted into its own commit with a bit more
> explanation.
> 
> > It's not suitable to print these information to the stderr. So, change
> > to stdout.
> 
> This one, too. Why exactly is in not suitable to print to stderr?
> 

I am sorry for the confusion. We should not print which directory we
check here into stderr. Because I think this will make test script
contain many unrelated info when using "git refs verify 2>err".

The reason here is when checking the consistency of refs in multiple
worktrees. The ref name could be repeat. For example, worktree A
has its own ref called "test" under ".git/worktrees/A/refs/worktree/test"
and worktree B has its own ref still called "test" under
".git/worktrees/B/refs/worktree/test".

However, the refname would be printed to "refs/worktree/test". It will
make the user confused which "refs/worktree/test" is checked. So, we
should print this information like:

    Checking references consistency in .git
    ...
    checking references consistency in .git/worktrees/A
    ...
    checking references consistency in .git/worktrees/B

However, when writing this, I feel a ".git" is a bad usage. It will make
the user think it will check everything here. This should be improved in
the next version.

> > @@ -75,7 +77,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> >  		OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
> >  		OPT_END(),
> >  	};
> > -	int ret;
> > +	int ret = 0;
> >  
> >  	argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
> >  	if (argc)
> > @@ -84,9 +86,14 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
> >  	git_config(git_fsck_config, &fsck_refs_options);
> >  	prepare_repo_settings(the_repository);
> >  
> > -	ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
> > +	worktrees = get_worktrees();
> > +	for (p = worktrees; *p; p++) {
> > +		struct worktree *wt = *p;
> > +		ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options);
> > +	}
> 
> I think it is more customary to say `ret |=` instead of `ref +=`.
> Otherwise we could at least in theory wrap around and even land at `ret
> == 0`, even though this is quite unlikely.
> 

I agree here. I will improve this in the next version.

[snip]

> > @@ -3600,8 +3600,16 @@ static int files_fsck(struct ref_store *ref_store,
> >  	struct files_ref_store *refs =
> >  		files_downcast(ref_store, REF_STORE_READ, "fsck");
> >  
> > -	return files_fsck_refs(ref_store, o) |
> > -	       refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
> > +	int ret = files_fsck_refs(ref_store, o);
> > +
> > +	/*
> > +	 * packed-refs should only be checked once because it is shared
> > +	 * between all worktrees.
> > +	 */
> > +	if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir))
> > +		ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
> > +
> > +	return ret;
> >  }
> >  
> >  struct ref_storage_be refs_be_files = {
> 
> What is the current behaviour? Is it that we verify the packed-refs file
> multiple times, or rather that we call `packed_ref_store->be->fsck()`
> many times even though we know it won't do anything for anything except
> for the main worktree?
> 

That's a good question. I think the second is the current behaviour. We
will call `packed_ref_store->be->fsck()` many times. I understand what
you mean here, we just put the check into `packed_ref_store->be->fsck()`
function.

> If it is the former I very much agree that we should make this
> conditional. If it's the latter I'm more in the camp of letting it be
> such that if worktrees were to ever gain support for "packed-refs" we
> wouldn't have to change anything.
> 

I agree.

> In any case, as proposed I think it would make sense to evict this into
> a standalone commit such that these details can be explained in the
> commit message.
> 

Yes, the current commit message lacks of details.

> Patrick

Thanks,
Jialuo
Patrick Steinhardt Oct. 7, 2024, 9:16 a.m. UTC | #3
On Mon, Oct 07, 2024 at 04:42:21PM +0800, shejialuo wrote:
> On Mon, Oct 07, 2024 at 08:58:30AM +0200, Patrick Steinhardt wrote:
> > On Sun, Sep 29, 2024 at 03:15:26PM +0800, shejialuo wrote:
> > > We have already set up the infrastructure to check the consistency for
> > > refs, but we do not support multiple worktrees. As we decide to add more
> > > checks for ref content, we need to set up support for multiple
> > > worktrees. Use "get_worktrees" and "get_worktree_ref_store" to check
> > > refs under the worktrees.
> > 
> > Makes sense.
> > 
> > > Because we should only check once for "packed-refs", let's call the fsck
> > > function for packed-backend when in the main worktree. In order to know
> > > which directory we check, we should default print this information
> > > instead of specifying "--verbose".
> > 
> > This change should likely be evicted into its own commit with a bit more
> > explanation.
> > 
> > > It's not suitable to print these information to the stderr. So, change
> > > to stdout.
> > 
> > This one, too. Why exactly is in not suitable to print to stderr?
> > 
> 
> I am sorry for the confusion. We should not print which directory we
> check here into stderr. Because I think this will make test script
> contain many unrelated info when using "git refs verify 2>err".
> 
> The reason here is when checking the consistency of refs in multiple
> worktrees. The ref name could be repeat. For example, worktree A
> has its own ref called "test" under ".git/worktrees/A/refs/worktree/test"
> and worktree B has its own ref still called "test" under
> ".git/worktrees/B/refs/worktree/test".
> 
> However, the refname would be printed to "refs/worktree/test". It will
> make the user confused which "refs/worktree/test" is checked. So, we
> should print this information like:
> 
>     Checking references consistency in .git
>     ...
>     checking references consistency in .git/worktrees/A
>     ...
>     checking references consistency in .git/worktrees/B
> 
> However, when writing this, I feel a ".git" is a bad usage. It will make
> the user think it will check everything here. This should be improved in
> the next version.

But wouldn't it be the better solution if we printed the fully-qualified
reference name "worktrees/worktree/refs/worktree/test" instead? That
would remove the need to say which directory we're currently verifying
in the first place.

Patrick
shejialuo Oct. 7, 2024, 12:06 p.m. UTC | #4
On Mon, Oct 07, 2024 at 11:16:19AM +0200, Patrick Steinhardt wrote:

[snip]

> > However, the refname would be printed to "refs/worktree/test". It will
> > make the user confused which "refs/worktree/test" is checked. So, we
> > should print this information like:
> > 
> >     Checking references consistency in .git
> >     ...
> >     checking references consistency in .git/worktrees/A
> >     ...
> >     checking references consistency in .git/worktrees/B
> > 
> > However, when writing this, I feel a ".git" is a bad usage. It will make
> > the user think it will check everything here. This should be improved in
> > the next version.
> 
> But wouldn't it be the better solution if we printed the fully-qualified
> reference name "worktrees/worktree/refs/worktree/test" instead? That
> would remove the need to say which directory we're currently verifying
> in the first place.
> 

Good idea. I will use this way in the next version.

> Patrick

Thanks
diff mbox series

Patch

diff --git a/builtin/refs.c b/builtin/refs.c
index 24978a7b7b..3c492ea922 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -5,6 +5,7 @@ 
 #include "parse-options.h"
 #include "refs.h"
 #include "strbuf.h"
+#include "worktree.h"
 
 #define REFS_MIGRATE_USAGE \
 	N_("git refs migrate --ref-format=<format> [--dry-run]")
@@ -66,6 +67,7 @@  static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
 static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
 {
 	struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
+	struct worktree **worktrees, **p;
 	const char * const verify_usage[] = {
 		REFS_VERIFY_USAGE,
 		NULL,
@@ -75,7 +77,7 @@  static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "strict", &fsck_refs_options.strict, N_("enable strict checking")),
 		OPT_END(),
 	};
-	int ret;
+	int ret = 0;
 
 	argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
 	if (argc)
@@ -84,9 +86,14 @@  static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
 	git_config(git_fsck_config, &fsck_refs_options);
 	prepare_repo_settings(the_repository);
 
-	ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+	worktrees = get_worktrees();
+	for (p = worktrees; *p; p++) {
+		struct worktree *wt = *p;
+		ret += refs_fsck(get_worktree_ref_store(wt), &fsck_refs_options);
+	}
 
 	fsck_options_clear(&fsck_refs_options);
+	free_worktrees(worktrees);
 	return ret;
 }
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 03d2503276..57318b4c4e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3558,7 +3558,7 @@  static int files_fsck_refs_dir(struct ref_store *ref_store,
 		} else if (S_ISREG(iter->st.st_mode) ||
 			   S_ISLNK(iter->st.st_mode)) {
 			if (o->verbose)
-				fprintf_ln(stderr, "Checking %s/%s",
+				fprintf_ln(stdout, "Checking %s/%s",
 					   refs_check_dir, iter->relative_path);
 			for (size_t i = 0; fsck_refs_fn[i]; i++) {
 				if (fsck_refs_fn[i](ref_store, o, refs_check_dir, iter))
@@ -3589,8 +3589,8 @@  static int files_fsck_refs(struct ref_store *ref_store,
 		NULL,
 	};
 
-	if (o->verbose)
-		fprintf_ln(stderr, _("Checking references consistency"));
+	fprintf_ln(stdout, _("Checking references consistency in %s"),
+		   ref_store->gitdir);
 	return files_fsck_refs_dir(ref_store, o,  "refs", fsck_refs_fn);
 }
 
@@ -3600,8 +3600,16 @@  static int files_fsck(struct ref_store *ref_store,
 	struct files_ref_store *refs =
 		files_downcast(ref_store, REF_STORE_READ, "fsck");
 
-	return files_fsck_refs(ref_store, o) |
-	       refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+	int ret = files_fsck_refs(ref_store, o);
+
+	/*
+	 * packed-refs should only be checked once because it is shared
+	 * between all worktrees.
+	 */
+	if (!strcmp(ref_store->gitdir, ref_store->repo->gitdir))
+		ret += refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+
+	return ret;
 }
 
 struct ref_storage_be refs_be_files = {
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 71a4d1a5ae..4c6cd6f7d0 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -89,4 +89,63 @@  test_expect_success 'ref name check should be adapted into fsck messages' '
 	test_must_be_empty err
 '
 
+test_expect_success 'ref name check should work for multiple worktrees' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+
+	cd repo &&
+	test_commit initial &&
+	git checkout -b branch-1 &&
+	test_commit second &&
+	git checkout -b branch-2 &&
+	test_commit third &&
+	git checkout -b branch-3 &&
+	git worktree add ./worktree-1 branch-1 &&
+	git worktree add ./worktree-2 branch-2 &&
+	worktree1_refdir_prefix=.git/worktrees/worktree-1/refs/worktree &&
+	worktree2_refdir_prefix=.git/worktrees/worktree-2/refs/worktree &&
+
+	(
+		cd worktree-1 &&
+		git update-ref refs/worktree/branch-4 refs/heads/branch-3
+	) &&
+	(
+		cd worktree-2 &&
+		git update-ref refs/worktree/branch-4 refs/heads/branch-3
+	) &&
+
+	cp $worktree1_refdir_prefix/branch-4 $worktree1_refdir_prefix/.branch-2 &&
+	cp $worktree2_refdir_prefix/branch-4 $worktree2_refdir_prefix/@ &&
+
+	test_must_fail git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	error: refs/worktree/.branch-2: badRefName: invalid refname format
+	error: refs/worktree/@: badRefName: invalid refname format
+	EOF
+	sort err >sorted_err &&
+	test_cmp expect sorted_err &&
+
+	(
+		cd worktree-1 &&
+		test_must_fail git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		error: refs/worktree/.branch-2: badRefName: invalid refname format
+		error: refs/worktree/@: badRefName: invalid refname format
+		EOF
+		sort err >sorted_err &&
+		test_cmp expect sorted_err
+	) &&
+
+	(
+		cd worktree-2 &&
+		test_must_fail git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		error: refs/worktree/.branch-2: badRefName: invalid refname format
+		error: refs/worktree/@: badRefName: invalid refname format
+		EOF
+		sort err >sorted_err &&
+		test_cmp expect sorted_err
+	)
+'
+
 test_done