diff mbox series

[v3,3/4] submodule: record superproject gitdir during absorbgitdirs

Message ID 20210819200953.2105230-4-emilyshaffer@google.com (mailing list archive)
State New, archived
Headers show
Series cache parent project's gitdir in submodules | expand

Commit Message

Emily Shaffer Aug. 19, 2021, 8:09 p.m. UTC
Already during 'git submodule add' we record a pointer to the
superproject's gitdir. However, this doesn't help brand-new
submodules created with 'git init' and later absorbed with 'git
submodule absorbgitdir'. Let's start adding that pointer during 'git
submodule absorbgitdir' too.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 submodule.c                        | 10 ++++++++++
 t/t7412-submodule-absorbgitdirs.sh |  9 ++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

Comments

Derrick Stolee Aug. 20, 2021, 12:50 a.m. UTC | #1
On 8/19/2021 4:09 PM, Emily Shaffer wrote:
> +	/* cache pointer to superproject's gitdir */
> +	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
> +	strbuf_addf(&config_path, "%s/config", real_new_git_dir);

Regarding this NEEDSWORK: you are specifying the config file that
covers every worktree of this submodule, which seems to make sense
to me. The submodule shouldn't have worktrees to itself, right? It
might become a worktree if its superproject has a worktree.

But also: it's still correct to write to the local, non-worktree
config file even if experimental.worktreeConfig is set. That
config just means "we need to read the worktree config as well
as the local config".

Or, am I misunderstanding something?

(I'm advocating for the removal of the NEEDSWORK, in case that
wasn't clear.)

> +	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
> +			       relative_path(get_super_prefix_or_empty(),
> +					     path, &sb));

Thanks,
-Stolee
Matheus Tavares Sept. 4, 2021, 5:27 p.m. UTC | #2
Emily Shaffer <emilyshaffer@google.com> wrote:
>
> Already during 'git submodule add' we record a pointer to the
> superproject's gitdir. However, this doesn't help brand-new
> submodules created with 'git init' and later absorbed with 'git
> submodule absorbgitdir'. Let's start adding that pointer during 'git
> submodule absorbgitdir' too.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  submodule.c                        | 10 ++++++++++
>  t/t7412-submodule-absorbgitdirs.sh |  9 ++++++++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/submodule.c b/submodule.c
> index 0b1d9c1dde..4b314bf09c 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -2065,6 +2065,7 @@ static void relocate_single_git_dir_into_superproject(const char *path)

This function has only one caller, `absorb_git_dir_into_superproject()`.
Perhaps it would be better to set submodule.superprojectGitdir in the caller,
right after the `else` block in which `relocate...()` is called. This way,
the config would also be set in the case of nested submodules where the inner
one already had its gitdir absorbed (which is the case handled by the code in
the `if` block).

>  	char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir = NULL;
>  	char *new_git_dir;
>  	const struct submodule *sub;
> +	struct strbuf config_path = STRBUF_INIT, sb = STRBUF_INIT;
>  
>  	if (submodule_uses_worktrees(path))
>  		die(_("relocate_gitdir for submodule '%s' with "
> @@ -2096,6 +2097,15 @@ static void relocate_single_git_dir_into_superproject(const char *path)
>  
>  	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
>  
> +	/* cache pointer to superproject's gitdir */
> +	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */

s/experimental/extensions/

On the Review Club, I mentioned we might want to save
submodule.superprojectGitdir at the worktree config file. But Jonathan and Josh
gave a better suggestion, which is to cache the superproject gitdir relative to
the submodule's gitdir, instead of its working tree.

This way, the worktree config wouldn't be an issue. And more importantly, this
would prevent `git mv` from making the cached path stale, as Stolee pointed out
upthread.

> +	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
> +	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
> +			       relative_path(get_super_prefix_or_empty(),
> +					     path, &sb));


In this code, `the_repository` corresponds to the superproject, right? I
think `get_super_prefix_or_empty()` should instead be
`absolute_path(get_git_dir())`, like you did on the previous patch.

And since the first argument to `relative_path()` will be an absolute path, I
believe we also need to convert the second one to an absolute path. Otherwise,
`relative_path()` would return the first argument as-is [1]. (I played around
with using `get_git_dir()` directly as the first argument, but it seems this
can sometimes already be absolute, in case of nested submodules.)

If we store the path as relative to the submodule's gitdir, it should be
simpler, as `real_new_git_dir` is already absolute:
	
	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
			       relative_path(absolute_path(get_git_dir())
					     real_new_git_dir, &sb));

[1]: I'm not sure if this is intended or if it's a bug. I was expecting that,
before comparing its two arguments, `relative_path()` would convert any
relative path given as argument to absolute, using the current working dir path.
But that's not the case.

> +	strbuf_release(&config_path);
> +	strbuf_release(&sb);
>  	free(old_git_dir);
>  	free(real_old_git_dir);
>  	free(real_new_git_dir);
> diff --git a/t/t7412-submodule-absorbgitdirs.sh b/t/t7412-submodule-absorbgitdirs.sh
> index 1cfa150768..e2d78e01df 100755
> --- a/t/t7412-submodule-absorbgitdirs.sh
> +++ b/t/t7412-submodule-absorbgitdirs.sh
> @@ -30,7 +30,14 @@ test_expect_success 'absorb the git dir' '
>  	git status >actual.1 &&
>  	git -C sub1 rev-parse HEAD >actual.2 &&
>  	test_cmp expect.1 actual.1 &&
> -	test_cmp expect.2 actual.2
> +	test_cmp expect.2 actual.2 &&
> +
> +	# make sure the submodule cached the superproject gitdir correctly
> +	test-tool path-utils real_path . >expect &&

This should be '.git' instead of '.', since the config caches the path to the
superproject's gitdir. But ...

> +	test-tool path-utils real_path \
> +		"$(git -C sub1 config submodule.superprojectGitDir)" >actual &&

... I think we could also avoid converting to an absolute path here, so that we
can test whether the setting is really caching a relative path. I.e., the test
could be:

	super_gitdir="$(git rev-parse --absolute-git-dir)" &&
	test-tool path-utils relative_path "$super_gitdir" "$PWD/sub1" >expect &&
	git -C sub1 config submodule.superprojectGitDir >actual &&
	test_cmp expect actual

> +
> +	test_cmp expect actual
>  '

It may also be interesting to test if the config is correctly set when
absorbing the gitdir of nested submodules:

diff --git a/t/t7412-submodule-absorbgitdirs.sh b/t/t7412-submodule-absorbgitdirs.sh
index e2d78e01df..c2e5e7dd1c 100755
--- a/t/t7412-submodule-absorbgitdirs.sh
+++ b/t/t7412-submodule-absorbgitdirs.sh
@@ -70,3 +70,8 @@ test_expect_success 'absorb the git dir in a nested submodule' '
 	test_cmp expect.1 actual.1 &&
-	test_cmp expect.2 actual.2
+	test_cmp expect.2 actual.2 &&
+
+	sub1_gitdir="$(git -C sub1 rev-parse --absolute-git-dir)" &&
+	test-tool path-utils relative_path "$sub1_gitdir" "$PWD/sub1/nested" >expect &&
+	git -C sub1/nested config submodule.superprojectGitDir >actual &&
+	test_cmp expect actual
 '
--- >8 ---
Emily Shaffer Oct. 13, 2021, 7:42 p.m. UTC | #3
On Thu, Aug 19, 2021 at 08:50:27PM -0400, Derrick Stolee wrote:
> 
> On 8/19/2021 4:09 PM, Emily Shaffer wrote:
> > +	/* cache pointer to superproject's gitdir */
> > +	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
> > +	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
> 
> Regarding this NEEDSWORK: you are specifying the config file that
> covers every worktree of this submodule, which seems to make sense
> to me. The submodule shouldn't have worktrees to itself, right? It
> might become a worktree if its superproject has a worktree.
> 
> But also: it's still correct to write to the local, non-worktree
> config file even if experimental.worktreeConfig is set. That
> config just means "we need to read the worktree config as well
> as the local config".
> 
> Or, am I misunderstanding something?
> 
> (I'm advocating for the removal of the NEEDSWORK, in case that
> wasn't clear.)

Yeah, I think by pointing from submodule gitdir to superproject gitdir
this point goes away, anyways. Regardless of which worktree is in use,
the gitdir->gitdir pointer will still be valid. I'll delete the
NEEDSWORK.

> 
> > +	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
> > +			       relative_path(get_super_prefix_or_empty(),
> > +					     path, &sb));
> 
> Thanks,
> -Stolee
>
Emily Shaffer Oct. 14, 2021, 6:40 p.m. UTC | #4
On Sat, Sep 04, 2021 at 02:27:46PM -0300, Matheus Tavares wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> wrote:
> >
> > Already during 'git submodule add' we record a pointer to the
> > superproject's gitdir. However, this doesn't help brand-new
> > submodules created with 'git init' and later absorbed with 'git
> > submodule absorbgitdir'. Let's start adding that pointer during 'git
> > submodule absorbgitdir' too.
> > 
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  submodule.c                        | 10 ++++++++++
> >  t/t7412-submodule-absorbgitdirs.sh |  9 ++++++++-
> >  2 files changed, 18 insertions(+), 1 deletion(-)
> > 
> > diff --git a/submodule.c b/submodule.c
> > index 0b1d9c1dde..4b314bf09c 100644
> > --- a/submodule.c
> > +++ b/submodule.c
> > @@ -2065,6 +2065,7 @@ static void relocate_single_git_dir_into_superproject(const char *path)
> 
> This function has only one caller, `absorb_git_dir_into_superproject()`.
> Perhaps it would be better to set submodule.superprojectGitdir in the caller,
> right after the `else` block in which `relocate...()` is called. This way,
> the config would also be set in the case of nested submodules where the inner
> one already had its gitdir absorbed (which is the case handled by the code in
> the `if` block).

Since relocate_single_git_dir() is the only place where the final
submodule gitdir is resolved, I'll leave the code block where it is;
neither side of the if/else in absorb_git_dir...() learns the final
"new" gitdir, so I'd rather not re-derive it.

Anyway, wouldn't the submodule-who-is-also-a-superproject have gone
through this same codepath itself? I'll see if I can find a test to
validate that.

> 
> >  	char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir = NULL;
> >  	char *new_git_dir;
> >  	const struct submodule *sub;
> > +	struct strbuf config_path = STRBUF_INIT, sb = STRBUF_INIT;
> >  
> >  	if (submodule_uses_worktrees(path))
> >  		die(_("relocate_gitdir for submodule '%s' with "
> > @@ -2096,6 +2097,15 @@ static void relocate_single_git_dir_into_superproject(const char *path)
> >  
> >  	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
> >  
> > +	/* cache pointer to superproject's gitdir */
> > +	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
> 
> s/experimental/extensions/
> 
> On the Review Club, I mentioned we might want to save
> submodule.superprojectGitdir at the worktree config file. But Jonathan and Josh
> gave a better suggestion, which is to cache the superproject gitdir relative to
> the submodule's gitdir, instead of its working tree.
> 
> This way, the worktree config wouldn't be an issue. And more importantly, this
> would prevent `git mv` from making the cached path stale, as Stolee pointed out
> upthread.

Yep, done. Thanks.

> 
> > +	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
> > +	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
> > +			       relative_path(get_super_prefix_or_empty(),
> > +					     path, &sb));
> 
> 
> In this code, `the_repository` corresponds to the superproject, right? I
> think `get_super_prefix_or_empty()` should instead be
> `absolute_path(get_git_dir())`, like you did on the previous patch.
> 
> And since the first argument to `relative_path()` will be an absolute path, I
> believe we also need to convert the second one to an absolute path. Otherwise,
> `relative_path()` would return the first argument as-is [1]. (I played around
> with using `get_git_dir()` directly as the first argument, but it seems this
> can sometimes already be absolute, in case of nested submodules.)
> 
> If we store the path as relative to the submodule's gitdir, it should be
> simpler, as `real_new_git_dir` is already absolute:
> 	
> 	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
> 			       relative_path(absolute_path(get_git_dir())
> 					     real_new_git_dir, &sb));
> 
> [1]: I'm not sure if this is intended or if it's a bug. I was expecting that,
> before comparing its two arguments, `relative_path()` would convert any
> relative path given as argument to absolute, using the current working dir path.
> But that's not the case.

Thanks, fixed.

> 
> > +	strbuf_release(&config_path);
> > +	strbuf_release(&sb);
> >  	free(old_git_dir);
> >  	free(real_old_git_dir);
> >  	free(real_new_git_dir);
> > diff --git a/t/t7412-submodule-absorbgitdirs.sh b/t/t7412-submodule-absorbgitdirs.sh
> > index 1cfa150768..e2d78e01df 100755
> > --- a/t/t7412-submodule-absorbgitdirs.sh
> > +++ b/t/t7412-submodule-absorbgitdirs.sh
> > @@ -30,7 +30,14 @@ test_expect_success 'absorb the git dir' '
> >  	git status >actual.1 &&
> >  	git -C sub1 rev-parse HEAD >actual.2 &&
> >  	test_cmp expect.1 actual.1 &&
> > -	test_cmp expect.2 actual.2
> > +	test_cmp expect.2 actual.2 &&
> > +
> > +	# make sure the submodule cached the superproject gitdir correctly
> > +	test-tool path-utils real_path . >expect &&
> 
> This should be '.git' instead of '.', since the config caches the path to the
> superproject's gitdir. But ...
> 
> > +	test-tool path-utils real_path \
> > +		"$(git -C sub1 config submodule.superprojectGitDir)" >actual &&
> 
> ... I think we could also avoid converting to an absolute path here, so that we
> can test whether the setting is really caching a relative path. I.e., the test
> could be:
> 
> 	super_gitdir="$(git rev-parse --absolute-git-dir)" &&
> 	test-tool path-utils relative_path "$super_gitdir" "$PWD/sub1" >expect &&
> 	git -C sub1 config submodule.superprojectGitDir >actual &&
> 	test_cmp expect actual

Sure.

> 
> > +
> > +	test_cmp expect actual
> >  '
> 
> It may also be interesting to test if the config is correctly set when
> absorbing the gitdir of nested submodules:
> 
> diff --git a/t/t7412-submodule-absorbgitdirs.sh b/t/t7412-submodule-absorbgitdirs.sh
> index e2d78e01df..c2e5e7dd1c 100755
> --- a/t/t7412-submodule-absorbgitdirs.sh
> +++ b/t/t7412-submodule-absorbgitdirs.sh
> @@ -70,3 +70,8 @@ test_expect_success 'absorb the git dir in a nested submodule' '
>  	test_cmp expect.1 actual.1 &&
> -	test_cmp expect.2 actual.2
> +	test_cmp expect.2 actual.2 &&
> +
> +	sub1_gitdir="$(git -C sub1 rev-parse --absolute-git-dir)" &&
> +	test-tool path-utils relative_path "$sub1_gitdir" "$PWD/sub1/nested" >expect &&
> +	git -C sub1/nested config submodule.superprojectGitDir >actual &&
> +	test_cmp expect actual

Ah, thanks. I tried it out, even without the changes you mentioned
above, and this test passes. So I think as I expected, it works because
'sub1' goes through relocate_single_git_dir() as well as 'sub1/nested'.

Thanks for the careful review.

 - Emily
diff mbox series

Patch

diff --git a/submodule.c b/submodule.c
index 0b1d9c1dde..4b314bf09c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -2065,6 +2065,7 @@  static void relocate_single_git_dir_into_superproject(const char *path)
 	char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir = NULL;
 	char *new_git_dir;
 	const struct submodule *sub;
+	struct strbuf config_path = STRBUF_INIT, sb = STRBUF_INIT;
 
 	if (submodule_uses_worktrees(path))
 		die(_("relocate_gitdir for submodule '%s' with "
@@ -2096,6 +2097,15 @@  static void relocate_single_git_dir_into_superproject(const char *path)
 
 	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
 
+	/* cache pointer to superproject's gitdir */
+	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
+	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
+	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
+			       relative_path(get_super_prefix_or_empty(),
+					     path, &sb));
+
+	strbuf_release(&config_path);
+	strbuf_release(&sb);
 	free(old_git_dir);
 	free(real_old_git_dir);
 	free(real_new_git_dir);
diff --git a/t/t7412-submodule-absorbgitdirs.sh b/t/t7412-submodule-absorbgitdirs.sh
index 1cfa150768..e2d78e01df 100755
--- a/t/t7412-submodule-absorbgitdirs.sh
+++ b/t/t7412-submodule-absorbgitdirs.sh
@@ -30,7 +30,14 @@  test_expect_success 'absorb the git dir' '
 	git status >actual.1 &&
 	git -C sub1 rev-parse HEAD >actual.2 &&
 	test_cmp expect.1 actual.1 &&
-	test_cmp expect.2 actual.2
+	test_cmp expect.2 actual.2 &&
+
+	# make sure the submodule cached the superproject gitdir correctly
+	test-tool path-utils real_path . >expect &&
+	test-tool path-utils real_path \
+		"$(git -C sub1 config submodule.superprojectGitDir)" >actual &&
+
+	test_cmp expect actual
 '
 
 test_expect_success 'absorbing does not fail for deinitialized submodules' '