diff mbox series

[v4,3/4] submodule: record superproject gitdir during absorbgitdirs

Message ID 20211014203416.2802639-4-emilyshaffer@google.com (mailing list archive)
State Superseded
Headers show
Series cache parent project's gitdir in submodules | expand

Commit Message

Emily Shaffer Oct. 14, 2021, 8:34 p.m. UTC
Already during 'git submodule add' we record a pointer to the
superproject's gitdir. However, this doesn't help brand-new
submodules created with 'git init' and later absorbed with 'git
submodule absorbgitdir'. Let's start adding that pointer during 'git
submodule absorbgitdir' too.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 submodule.c                        | 10 ++++++++++
 t/t7412-submodule-absorbgitdirs.sh | 23 +++++++++++++++++++++--
 2 files changed, 31 insertions(+), 2 deletions(-)

Comments

Jonathan Tan Oct. 18, 2021, 11:18 p.m. UTC | #1
> Already during 'git submodule add' we record a pointer to the
> superproject's gitdir. However, this doesn't help brand-new
> submodules created with 'git init' and later absorbed with 'git
> submodule absorbgitdir'. Let's start adding that pointer during 'git
> submodule absorbgitdir' too.

s/absorbgitdir/absorbgitdirs/ (note the "s" at the end)

> @@ -2114,6 +2115,15 @@ static void relocate_single_git_dir_into_superproject(const char *path)
>  
>  	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
>  
> +	/* cache pointer to superproject's gitdir */
> +	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
> +	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
> +	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
> +			       relative_path(absolute_path(get_git_dir()),
> +					     real_new_git_dir, &sb));
> +
> +	strbuf_release(&config_path);
> +	strbuf_release(&sb);
>  	free(old_git_dir);
>  	free(real_old_git_dir);
>  	free(real_new_git_dir);

Here [1] you mention that you'll delete the NEEDSWORK, but it's still
there.

Having said that, it might be better to make a test in which we call
this command while in a worktree of a superproject. The test might
reveal that (as pointed out to me internally) you might need to use the
common dir functions instead of the git dir functions to point to the
directory that you want (git-worktree.txt distinguishes the 2 if you
search for GIT_COMMON_DIR).

Besides that, all 4 patches look good (including the description of the
new config variable).

[1] https://lore.kernel.org/git/YWc2iJ7FQJYCnQ7w@google.com/
Derrick Stolee Oct. 25, 2021, 4:14 p.m. UTC | #2
On 10/18/2021 7:18 PM, Jonathan Tan wrote:
>> Already during 'git submodule add' we record a pointer to the
>> superproject's gitdir. However, this doesn't help brand-new
>> submodules created with 'git init' and later absorbed with 'git
>> submodule absorbgitdir'. Let's start adding that pointer during 'git
>> submodule absorbgitdir' too.
> 
> s/absorbgitdir/absorbgitdirs/ (note the "s" at the end)
> 
>> @@ -2114,6 +2115,15 @@ static void relocate_single_git_dir_into_superproject(const char *path)
>>  
>>  	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
>>  
>> +	/* cache pointer to superproject's gitdir */
>> +	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
>> +	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
>> +	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
>> +			       relative_path(absolute_path(get_git_dir()),
>> +					     real_new_git_dir, &sb));
>> +
>> +	strbuf_release(&config_path);
>> +	strbuf_release(&sb);
>>  	free(old_git_dir);
>>  	free(real_old_git_dir);
>>  	free(real_new_git_dir);
> 
> Here [1] you mention that you'll delete the NEEDSWORK, but it's still
> there.
> 
> Having said that, it might be better to make a test in which we call
> this command while in a worktree of a superproject. The test might
> reveal that (as pointed out to me internally) you might need to use the
> common dir functions instead of the git dir functions to point to the
> directory that you want (git-worktree.txt distinguishes the 2 if you
> search for GIT_COMMON_DIR).

I came here to say the same thing. It's a bit too direct to compute
the location of a config file this way, so we should expose a method
that can create one for a given Git directory.

Since you're setting this config value inside the submodule's config,
what does it mean for a submodule to also be a worktree (and hence
require config.worktree)? What happens in this rough scenario?

  1. git init sub
  2. git init super
  3. git -C sub worktree add super/sub
  4. git -C super submodule absorbgitdir sub

I haven't actually tried running these things, but it seems unusual
and unexpected. This doesn't even account for cases where the repo
root and a worktree are both submodules within the superproject.

If we already have protections preventing these worktrees as
submodules, then perhaps there is no need for work here. I'm not
familiar enough with the area to make a claim one way or another.

Thanks,
-Stolee
Emily Shaffer Nov. 4, 2021, 10:09 p.m. UTC | #3
On Mon, Oct 18, 2021 at 04:18:18PM -0700, Jonathan Tan wrote:
> 
> > Already during 'git submodule add' we record a pointer to the
> > superproject's gitdir. However, this doesn't help brand-new
> > submodules created with 'git init' and later absorbed with 'git
> > submodule absorbgitdir'. Let's start adding that pointer during 'git
> > submodule absorbgitdir' too.
> 
> s/absorbgitdir/absorbgitdirs/ (note the "s" at the end)
> 
> > @@ -2114,6 +2115,15 @@ static void relocate_single_git_dir_into_superproject(const char *path)
> >  
> >  	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
> >  
> > +	/* cache pointer to superproject's gitdir */
> > +	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
> > +	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
> > +	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
> > +			       relative_path(absolute_path(get_git_dir()),
> > +					     real_new_git_dir, &sb));
> > +
> > +	strbuf_release(&config_path);
> > +	strbuf_release(&sb);
> >  	free(old_git_dir);
> >  	free(real_old_git_dir);
> >  	free(real_new_git_dir);
> 
> Here [1] you mention that you'll delete the NEEDSWORK, but it's still
> there.
> 
> Having said that, it might be better to make a test in which we call
> this command while in a worktree of a superproject. The test might
> reveal that (as pointed out to me internally) you might need to use the
> common dir functions instead of the git dir functions to point to the
> directory that you want (git-worktree.txt distinguishes the 2 if you
> search for GIT_COMMON_DIR).

Huh, something interesting happened, actually.

I wrote a little test:

  test_expect_success 'absorbgitdirs works when called from a superproject worktree' '
          # set up a worktree of the superproject
          git worktree add wt &&
          (
          cd wt &&
 
          # create a new unembedded git dir
          git init sub4 &&
          test_commit -C sub4 first &&
          git submodule add ./sub4 &&
          test_tick &&
 
          # absorb the git dir
          git submodule absorbgitdirs sub4 &&
 
          # make sure the submodule cached the superproject gitdir correctly
          submodule_gitdir="$(git -C sub4 rev-parse --absolute-git-dir)" &&
          superproject_gitdir="$(git rev-parse --absolute-git-dir)" &&
 
          test-tool path-utils relative_path "$superproject_gitdir" \
                  "$submodule_gitdir" >expect &&
          git -C sub4 config submodule.superprojectGitDir >actual &&
 
          test_pause &&
          test_cmp expect actual
          )
  '

However, the `git submodule absorbgitdirs` command didn't do quite what
I expected.

When I made a new worktree, that worktree's gitdir showed up in
'$TEST_DIR/.git/worktrees/wt/'. That's not very surprising. But what did
surprise me was that when I called `git submodule absorbgitdirs sub4`,
sub4's gitdir ended up in '$TEST_DIR/.git/worktrees/wt/modules/sub4',
rather than in '$TEST_DIR/.git/modules/sub4'. That's a little
surprising!

Anyway, this test has a sort of pernicious mistake too - I'm checking
relative path between 'git rev-parse --absolute-git-dir's, but the
relative path from .git/worktrees/wt/ to .git/worktrees/wt/modules/sub4
is the same as the relative path from .git/ to .git/modules/sub4, so
this test actually passes.

I'll change the tests here and elsewhere to use 'git rev-parse
--path-format=absolute --git-common-dir', but I think that still leaves
a kind of dangerous state for people working a lot with worktrees and
submodules - if I like to make throwaway worktrees in my regular
workflow, and I create a new submodule in one, and then get rid of the
worktree when I'm done with the task that added the new submodule, I
think it will explode if I try to checkout the branch I was using in
that worktree.

I tried it out locally:

 # setup
 git init test && cd test
 git commit --allow-empty -m "first commit"
 git worktree add wt
 (
   cd wt
   git init sub
   git -C sub commit --allow-empty -m "first-commit"
   git submodule add ./sub
   git commit -m "add submodule (as initted)
   git submodule absorbgitdirs sub
   # This told me "Migrating git directory of 'sub' from
   # test/wt/sub/.git to test/.git/worktrees/wt/modules/sub, oops
 )

 # Make it possible to checkout wt branch somewhere else
 git -C wt checkout HEAD --detach

 # Try and delete the worktree; presumably my work is done
 git worktree remove wt

At this point, Git wouldn't let me remove the worktree:
  fatal: working trees containing submodules cannot be moved or removed

But wouldn't it be better to just absorb the submodule into the common
dir instead...?

By the way, when I tried checking out 'wt' branch from the original
worktree without deleting the new worktree, and then running 'git
submodule update', Git cloned 'sub' from .git/worktrees/wt/modules/sub
into .git/modules/sub. That was pretty surprising too!


Anyway, all of that is kind of a tangent. The takeaways I got are
twofold:

1) Yes, use common dir, not gitdir, in all the cached
path-to-superproject stuff.
2) Someone (me?) should send a patch keeping the "you can't delete a
submodule with a gitdir in it" die(), but *also* changing the behavior
to put the new submodule gitdir into get_common_dir() instead of
get_git_dir().

I'll try to send (2) separately - I think it will be a pretty small change,
and by keeping around the die() for deleting a worktree that already has
a submodule gitdir in it, we won't be risking deleting anybody's
existing work.

 - Emily

> 
> Besides that, all 4 patches look good (including the description of the
> new config variable).
> 
> [1] https://lore.kernel.org/git/YWc2iJ7FQJYCnQ7w@google.com/
Emily Shaffer Nov. 4, 2021, 11:22 p.m. UTC | #4
On Mon, Oct 25, 2021 at 12:14:07PM -0400, Derrick Stolee wrote:
> 
> On 10/18/2021 7:18 PM, Jonathan Tan wrote:
> >> Already during 'git submodule add' we record a pointer to the
> >> superproject's gitdir. However, this doesn't help brand-new
> >> submodules created with 'git init' and later absorbed with 'git
> >> submodule absorbgitdir'. Let's start adding that pointer during 'git
> >> submodule absorbgitdir' too.
> > 
> > s/absorbgitdir/absorbgitdirs/ (note the "s" at the end)
> > 
> >> @@ -2114,6 +2115,15 @@ static void relocate_single_git_dir_into_superproject(const char *path)
> >>  
> >>  	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
> >>  
> >> +	/* cache pointer to superproject's gitdir */
> >> +	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
> >> +	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
> >> +	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
> >> +			       relative_path(absolute_path(get_git_dir()),
> >> +					     real_new_git_dir, &sb));
> >> +
> >> +	strbuf_release(&config_path);
> >> +	strbuf_release(&sb);
> >>  	free(old_git_dir);
> >>  	free(real_old_git_dir);
> >>  	free(real_new_git_dir);
> > 
> > Here [1] you mention that you'll delete the NEEDSWORK, but it's still
> > there.
> > 
> > Having said that, it might be better to make a test in which we call
> > this command while in a worktree of a superproject. The test might
> > reveal that (as pointed out to me internally) you might need to use the
> > common dir functions instead of the git dir functions to point to the
> > directory that you want (git-worktree.txt distinguishes the 2 if you
> > search for GIT_COMMON_DIR).
> 
> I came here to say the same thing. It's a bit too direct to compute
> the location of a config file this way, so we should expose a method
> that can create one for a given Git directory.
> 
> Since you're setting this config value inside the submodule's config,
> what does it mean for a submodule to also be a worktree (and hence
> require config.worktree)? What happens in this rough scenario?
> 
>   1. git init sub
>   2. git init super
>   3. git -C sub worktree add super/sub
>   4. git -C super submodule absorbgitdir sub
> 
> I haven't actually tried running these things, but it seems unusual
> and unexpected. This doesn't even account for cases where the repo
> root and a worktree are both submodules within the superproject.
> 
> If we already have protections preventing these worktrees as
> submodules, then perhaps there is no need for work here. I'm not
> familiar enough with the area to make a claim one way or another.

Yeah, I think there is actually a test case covering this in t7412:

137 test_expect_success 'setup a submodule with multiple worktrees' '
138         # first create another unembedded git dir in a new submodule
139         git init sub3 &&
140         test_commit -C sub3 first &&
141         git submodule add ./sub3 &&
142         test_tick &&
143         git commit -m "add another submodule" &&
144         git -C sub3 worktree add ../sub3_second_work_tree
145 '
146
147 test_expect_success 'absorbing fails for a submodule with multiple worktrees' '
148         test_must_fail git submodule absorbgitdirs sub3 2>error &&
149         test_i18ngrep "not supported" error
150 '

That is, I think because 'sub/' in your scenario above has multiple
worktrees, the absorbgitdirs will fail. So I won't do additional work
here. Thanks.

 - Emily

> 
> Thanks,
> -Stolee
Derrick Stolee Nov. 8, 2021, 1:07 a.m. UTC | #5
On 11/4/2021 7:22 PM, Emily Shaffer wrote:
> On Mon, Oct 25, 2021 at 12:14:07PM -0400, Derrick Stolee wrote:
>> Since you're setting this config value inside the submodule's config,
>> what does it mean for a submodule to also be a worktree (and hence
>> require config.worktree)? What happens in this rough scenario?
>>
>>   1. git init sub
>>   2. git init super
>>   3. git -C sub worktree add super/sub
>>   4. git -C super submodule absorbgitdir sub
>>
>> I haven't actually tried running these things, but it seems unusual
>> and unexpected. This doesn't even account for cases where the repo
>> root and a worktree are both submodules within the superproject.
>>
>> If we already have protections preventing these worktrees as
>> submodules, then perhaps there is no need for work here. I'm not
>> familiar enough with the area to make a claim one way or another.
> 
> Yeah, I think there is actually a test case covering this in t7412:
...
> 147 test_expect_success 'absorbing fails for a submodule with multiple worktrees' '
> 148         test_must_fail git submodule absorbgitdirs sub3 2>error &&
> 149         test_i18ngrep "not supported" error
> 150 '
> 
> That is, I think because 'sub/' in your scenario above has multiple
> worktrees, the absorbgitdirs will fail. So I won't do additional work
> here. Thanks.

Good to hear. Thanks for tracking that down.

-Stolee
diff mbox series

Patch

diff --git a/submodule.c b/submodule.c
index 96487517f9..571b68b66d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -2084,6 +2084,7 @@  static void relocate_single_git_dir_into_superproject(const char *path)
 	char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir = NULL;
 	struct strbuf new_gitdir = STRBUF_INIT;
 	const struct submodule *sub;
+	struct strbuf config_path = STRBUF_INIT, sb = STRBUF_INIT;
 
 	if (submodule_uses_worktrees(path))
 		die(_("relocate_gitdir for submodule '%s' with "
@@ -2114,6 +2115,15 @@  static void relocate_single_git_dir_into_superproject(const char *path)
 
 	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
 
+	/* cache pointer to superproject's gitdir */
+	/* NEEDSWORK: this may differ if experimental.worktreeConfig is enabled */
+	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
+	git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir",
+			       relative_path(absolute_path(get_git_dir()),
+					     real_new_git_dir, &sb));
+
+	strbuf_release(&config_path);
+	strbuf_release(&sb);
 	free(old_git_dir);
 	free(real_old_git_dir);
 	free(real_new_git_dir);
diff --git a/t/t7412-submodule-absorbgitdirs.sh b/t/t7412-submodule-absorbgitdirs.sh
index 1cfa150768..9ee5ccd660 100755
--- a/t/t7412-submodule-absorbgitdirs.sh
+++ b/t/t7412-submodule-absorbgitdirs.sh
@@ -30,7 +30,17 @@  test_expect_success 'absorb the git dir' '
 	git status >actual.1 &&
 	git -C sub1 rev-parse HEAD >actual.2 &&
 	test_cmp expect.1 actual.1 &&
-	test_cmp expect.2 actual.2
+	test_cmp expect.2 actual.2 &&
+
+	# make sure the submodule cached the superproject gitdir correctly
+	submodule_gitdir="$(git -C sub1 rev-parse --absolute-git-dir)" &&
+	superproject_gitdir="$(git rev-parse --absolute-git-dir)" &&
+
+	test-tool path-utils relative_path "$superproject_gitdir" \
+		"$submodule_gitdir" >expect &&
+	git -C sub1 config submodule.superprojectGitDir >actual &&
+
+	test_cmp expect actual
 '
 
 test_expect_success 'absorbing does not fail for deinitialized submodules' '
@@ -61,7 +71,16 @@  test_expect_success 'absorb the git dir in a nested submodule' '
 	git status >actual.1 &&
 	git -C sub1/nested rev-parse HEAD >actual.2 &&
 	test_cmp expect.1 actual.1 &&
-	test_cmp expect.2 actual.2
+	test_cmp expect.2 actual.2 &&
+
+	sub1_gitdir="$(git -C sub1 rev-parse --absolute-git-dir)" &&
+	sub1_nested_gitdir="$(git -C sub1/nested rev-parse --absolute-git-dir)" &&
+
+	test-tool path-utils relative_path "$sub1_gitdir" "$sub1_nested_gitdir" \
+		>expect &&
+	git -C sub1/nested config submodule.superprojectGitDir >actual &&
+
+	test_cmp expect actual
 '
 
 test_expect_success 're-setup nested submodule' '