diff mbox series

[v2,11/12] submodule: use the correct default for the main branch name

Message ID 59d6267f099f30f830836a2422289bc83f5c35e5.1592225416.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Allow overriding the default name of the default branch | expand

Commit Message

John Passaro via GitGitGadget June 15, 2020, 12:50 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

To allow for overriding the default branch name, we have introduced a
config setting. With this patch, the `git submodule` command learns
about this, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/submodule--helper.c | 10 ++++++++--
 t/t7406-submodule-update.sh |  7 +++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

Comments

Jeff King June 16, 2020, 1:46 p.m. UTC | #1
On Mon, Jun 15, 2020 at 12:50:15PM +0000, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> To allow for overriding the default branch name, we have introduced a
> config setting. With this patch, the `git submodule` command learns
> about this, too.

This was the other reading case (besides guess_remote_head()) that I'm
most concerned with causing regressions in a world where some repos are
on "master" and some are on "main".

This value ends up as the output of "submodule--helper remote-branch".

I was initially worried that we used this branch name for the fallback
when the server doesn't allow us to fetch the sha1 directly, but it
doesn't look like it. That's good, because handling fallbacks there
would be tricky.

Instead, we seem to use this only after fetching all of the refs for a
submodule:

  $ git grep -h -B2 -A11 remote-branch git-submodule.sh
  		if test -n "$remote"
  		then
  			branch=$(git submodule--helper remote-branch "$sm_path")
  			if test -z "$nofetch"
  			then
  				# Fetch remote before determining tracking $sha1
  				fetch_in_submodule "$sm_path" $depth ||
  				die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")"
  			fi
  			remote_name=$(sanitize_submodule_env; cd "$sm_path" && get_default_remote)
  			sha1=$(sanitize_submodule_env; cd "$sm_path" &&
  				git rev-parse --verify "${remote_name}/${branch}") ||
  			die "$(eval_gettext "Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")"
  		fi

and then we just use that branch name to resolve a sha1. So this will
break cases where you've set init.mainBranch, the submodule repo is
still on "master", and you haven't configured a branch in .gitmodules.

It seems like, independent of any change in the default branch names, we
ought to be using $remote_name/HEAD for this case anyway. I suspect that
would be a behavior improvement by itself, as it means more cases could
avoid having to specify the branch name in .gitmodules manually.
Probably nobody noticed so far because "HEAD" is almost always "master"
in the current world. It technically breaks the case that you truly did
want to use "master" in the submodule, but they set HEAD to something
else, and you couldn't be bothered to put it into your .gitmodules file.
That seems rather unlikely to me.

And then everything would Just Work without having to worry about the
local mainbranch value at all.

Alternatively, submodule--helper could pass back the empty string for
"no, we don't have a configured branch name" and this shell code could
actually try a sequence of reasonable guesses: init.mainbranch, then
"master" (and between the two, "main" if that later becomes the
default).

-Peff
Johannes Schindelin June 23, 2020, 9:03 p.m. UTC | #2
Hi Peff,

On Tue, 16 Jun 2020, Jeff King wrote:

> On Mon, Jun 15, 2020 at 12:50:15PM +0000, Johannes Schindelin via GitGitGadget wrote:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > To allow for overriding the default branch name, we have introduced a
> > config setting. With this patch, the `git submodule` command learns
> > about this, too.
>
> This was the other reading case (besides guess_remote_head()) that I'm
> most concerned with causing regressions in a world where some repos are
> on "master" and some are on "main".
>
> This value ends up as the output of "submodule--helper remote-branch".
>
> I was initially worried that we used this branch name for the fallback
> when the server doesn't allow us to fetch the sha1 directly, but it
> doesn't look like it. That's good, because handling fallbacks there
> would be tricky.
>
> Instead, we seem to use this only after fetching all of the refs for a
> submodule:
>
>   $ git grep -h -B2 -A11 remote-branch git-submodule.sh
>   		if test -n "$remote"
>   		then
>   			branch=$(git submodule--helper remote-branch "$sm_path")
>   			if test -z "$nofetch"
>   			then
>   				# Fetch remote before determining tracking $sha1
>   				fetch_in_submodule "$sm_path" $depth ||
>   				die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")"
>   			fi
>   			remote_name=$(sanitize_submodule_env; cd "$sm_path" && get_default_remote)
>   			sha1=$(sanitize_submodule_env; cd "$sm_path" &&
>   				git rev-parse --verify "${remote_name}/${branch}") ||
>   			die "$(eval_gettext "Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")"
>   		fi
>
> and then we just use that branch name to resolve a sha1. So this will
> break cases where you've set init.mainBranch, the submodule repo is
> still on "master", and you haven't configured a branch in .gitmodules.
>
> It seems like, independent of any change in the default branch names, we
> ought to be using $remote_name/HEAD for this case anyway. I suspect that
> would be a behavior improvement by itself, as it means more cases could
> avoid having to specify the branch name in .gitmodules manually.
> Probably nobody noticed so far because "HEAD" is almost always "master"
> in the current world. It technically breaks the case that you truly did
> want to use "master" in the submodule, but they set HEAD to something
> else, and you couldn't be bothered to put it into your .gitmodules file.
> That seems rather unlikely to me.
>
> And then everything would Just Work without having to worry about the
> local mainbranch value at all.

This is the route that I am taking.

Please note that t7519 contains a few test cases that rely on the current
confusing behavior where `git submodule update --remote` fetches the
remote `master` even if that is not the remote repository's current
branch!

I did adjust t7519 to stop verifying this confusing behavior, and to
verify the saner behavior instead.

This is of course a bit worrisome, as there might actually be users out
there relying on the confusing behavior.

However, I think it is okay to fix this:

- The `git submodule update --remote` command does not strike me as
  awfully common. In fact, I had never heard of it before I worked on this
  here patch.

- Current Git's behavior when running this command is outright confusing,
  unless the remote repository's current branch _is_ `master` (in which
  case the proposed behavior matches the old behavior).

- It is actually easily fixed by setting `submodule.<name>.branch` to
  `master` _iff_ users want to reinstate the old behavior.

> Alternatively, submodule--helper could pass back the empty string for
> "no, we don't have a configured branch name" and this shell code could
> actually try a sequence of reasonable guesses: init.mainbranch, then
> "master" (and between the two, "main" if that later becomes the
> default).

Quite honestly: I'd rather not.

Thank you,
Dscho
Jeff King June 23, 2020, 9:14 p.m. UTC | #3
On Tue, Jun 23, 2020 at 11:03:57PM +0200, Johannes Schindelin wrote:

> > [use HEAD instead of master]
> This is the route that I am taking.
> [...]
> This is of course a bit worrisome, as there might actually be users out
> there relying on the confusing behavior.
> 
> However, I think it is okay to fix this:
> 
> - The `git submodule update --remote` command does not strike me as
>   awfully common. In fact, I had never heard of it before I worked on this
>   here patch.
> 
> - Current Git's behavior when running this command is outright confusing,
>   unless the remote repository's current branch _is_ `master` (in which
>   case the proposed behavior matches the old behavior).
> 
> - It is actually easily fixed by setting `submodule.<name>.branch` to
>   `master` _iff_ users want to reinstate the old behavior.

Thanks, I agree that this is the best way forward, and that the current
confusing behavior is as likely to be doing the wrong thing as the right
(when it even differs).

-Peff
diff mbox series

Patch

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 46c03d2a126..48d73e8d9cb 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1980,8 +1980,14 @@  static const char *remote_submodule_branch(const char *path)
 		branch = sub->branch;
 	free(key);
 
-	if (!branch)
-		return "master";
+	if (!branch) {
+		static char *fall_back;
+
+		if (!fall_back)
+			fall_back = git_main_branch_name(MAIN_BRANCH_FOR_INIT);
+
+		return fall_back;
+	}
 
 	if (!strcmp(branch, ".")) {
 		const char *refname = resolve_ref_unsafe("HEAD", 0, NULL, NULL);
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 4fb447a143e..641113afef4 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -70,6 +70,13 @@  test_expect_success 'setup a submodule tree' '
 	)
 '
 
+test_expect_success 'update --remote uses configured default main branch' '
+	git clone super main-branch &&
+	test_must_fail git -C main-branch -c init.defaultBranch=hello \
+		submodule update --init --remote submodule 2>err &&
+	test_i18ngrep origin/hello err
+'
+
 test_expect_success 'submodule update detaching the HEAD ' '
 	(cd super/submodule &&
 	 git reset --hard HEAD~1