diff mbox series

clone: in protocol v2, use remote's default branch

Message ID 20201211210508.2337494-1-jonathantanmy@google.com (mailing list archive)
State New, archived
Headers show
Series clone: in protocol v2, use remote's default branch | expand

Commit Message

Jonathan Tan Dec. 11, 2020, 9:05 p.m. UTC
When cloning an empty repository, a default branch is created. However,
it is named after the locally configured init.defaultBranch, not the
default branch of the remote repository.

To solve this, the remote needs to communicate the target of the HEAD
symref, and "git clone" needs to use this information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
This feature indicates that "ls-refs" supports the "unborn" argument;
when it is specified, "ls-refs" will send the HEAD symref with the name
of its unborn target.

On the client side, Git will always send the "unborn" argument if it is
supported by the server. During "git clone", if cloning an empty
repository, Git will use the new information to determine the local
branch to create. In all other cases, Git will ignore it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/technical/protocol-v2.txt | 10 ++++-
 builtin/clone.c                         | 19 +++++++--
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         |  2 +-
 builtin/ls-remote.c                     |  2 +-
 builtin/remote.c                        |  2 +-
 connect.c                               | 29 +++++++++++--
 ls-refs.c                               | 54 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 remote.h                                |  3 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  7 ++--
 t/t5702-protocol-v2.sh                  | 17 ++++++++
 transport-helper.c                      |  7 +++-
 transport-internal.h                    | 13 +++---
 transport.c                             | 29 ++++++++-----
 transport.h                             |  7 +++-
 17 files changed, 165 insertions(+), 42 deletions(-)

Comments

Junio C Hamano Dec. 11, 2020, 11:41 p.m. UTC | #1
Jonathan Tan <jonathantanmy@google.com> writes:

> When cloning an empty repository, a default branch is created. However,
> it is named after the locally configured init.defaultBranch, not the
> default branch of the remote repository.

The default branch of the remote repository and the current branch
pointed at by their HEAD in the remote repository can be different,
and we are interested in setting our HEAD to the latter.  So 

	..., not the current branch of the remote repository.

> To solve this, the remote needs to communicate the target of the HEAD
> symref, and "git clone" needs to use this information.

Yes.  That's a good change (I am on vacation today, so I won't be
reading the changes themselves today, but I agree with the intent
100%).

Thanks.
Ævar Arnfjörð Bjarmason Dec. 14, 2020, 12:38 p.m. UTC | #2
On Fri, Dec 11 2020, Jonathan Tan wrote:

> When cloning an empty repository, a default branch is created. However,
> it is named after the locally configured init.defaultBranch, not the
> default branch of the remote repository.
>
> To solve this, the remote needs to communicate the target of the HEAD
> symref, and "git clone" needs to use this information.
>
> Currently, symrefs that have unborn targets (such as in this case) are
> not communicated by the protocol. Teach Git to advertise and support the
> "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
> This feature indicates that "ls-refs" supports the "unborn" argument;
> when it is specified, "ls-refs" will send the HEAD symref with the name
> of its unborn target.
>
> On the client side, Git will always send the "unborn" argument if it is
> supported by the server. During "git clone", if cloning an empty
> repository, Git will use the new information to determine the local
> branch to create. In all other cases, Git will ignore it.

I'm not a fan of this change not because of the whole s/master/whatever/
discussion, but because of the magic it adds for seemingly little gain &
without any documentation.

So if I have init.defaultBranch explicitly set that'll be ignored on
"clone", but on "init/git remote add/fetch" it won't?

I think so, and I swear I knew yesterday when I read this patch, but now
I can't remember. Anyway, the point that I avoided re-reading the patch
to find out, because even if there's an on-list answer to that it should
really be documented because I'll forget it next week, and our users
will never know :)

This patch also leaves Documentation/config/init.txt untouched, and now
under lsrefs.unborn it explicitly contradicts the behavior of git:

    Allows overriding the default branch name e.g. when initializing
    a new repository or when cloning an empty repository.

Shouldn't this at the very least be a
init.defaultBranchFromRemote=<bool> which if set overrides
init.defaultBranch? We could turn that to "true" by default and get the
same behavior as you have here, but with less inexplicable magic for the
user, no?

It seems if you're a user and wonder why a clone of a bare repo doesn't
give you "init" defaults the only way you'll find out is
GIT_TRACE_PACKET and the like.

Another reason I'm not a fan of it is because it's another piece of
magic "clone" does that you can't emulate in "init/fetch". We have
e.g. --single-branch as an existing case of that (although you can at
least do that with parse ls-remote -> init -> config -> fetch), and
that's a case that doesn't fit into a refspec.

But shouldn't there at least be a corresponding "fetch" option? On init
we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
..."?

Maybe not for reasons I haven't thought of, but I'd at least be much
happier with an updated commit message justifying another special-case
in clone that you can't do with "init/fetch".

And on the "litte gain" side of things: I very much suspect that the
only users who'll ever use this will be some big hosting providers (but
maybe not, the commit doesn't suggest a use-case). Wouldn't this be even
more useful in those cases by just a pre-receive hook on their side
detecting an initial push refusing "master", and:

    git push -o yes-use-old-init-default <...>

Instead of a patch to git to do the same & which would take $SOMEYEARS
to be rolled out, since it depends on client-side understanding.

Comment on the patch below (okey I did read some of it:):

> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index e597b74da3..dfe03aa114 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
>  	When specified, only references having a prefix matching one of
>  	the provided prefixes are displayed.
>  
> +If the 'unborn' feature is advertised the following argument can be
> +included in the client's request.
> +
> +    unborn
> +	The server may send symrefs pointing to unborn branches in the form
> +	"unborn <refname> symref-target:<target>".
> +
>  The output of ls-refs is as follows:
>  
>      output = *ref
>  	     flush-pkt
> -    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
> +    obj-id-or-unborn = (obj-id | "unborn")
> +    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
>      ref-attribute = (symref | peeled)
>      symref = "symref-target:" symref-target
>      peeled = "peeled:" obj-id
> diff --git a/builtin/clone.c b/builtin/clone.c
> index a0841923cf..217c87fddf 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -980,6 +980,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	int submodule_progress;
>  
>  	struct strvec ref_prefixes = STRVEC_INIT;
> +	char *unborn_head_target = NULL;
>  
>  	packet_trace_identity("clone");
>  
> @@ -1264,7 +1265,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	if (!option_no_tags)
>  		strvec_push(&ref_prefixes, "refs/tags/");
>  
> -	refs = transport_get_remote_refs(transport, &ref_prefixes);
> +	refs = transport_get_remote_refs(transport, &ref_prefixes,
> +					 &unborn_head_target);
>  
>  	if (refs) {
>  		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
> @@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		remote_head = NULL;
>  		option_no_checkout = 1;
>  		if (!option_bare) {
> -			const char *branch = git_default_branch_name();
> -			char *ref = xstrfmt("refs/heads/%s", branch);
> +			const char *branch;
> +			char *ref;
> +
> +			if (unborn_head_target &&
> +			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
> +				ref = unborn_head_target;
> +				unborn_head_target = NULL;
> +			} else {
> +				branch = git_default_branch_name();
> +				ref = xstrfmt("refs/heads/%s", branch);
> +			}
>  
>  			install_branch_config(0, branch, remote_name, ref);
> +			create_symref("HEAD", ref, "");
>  			free(ref);
>  		}
>  	}
> @@ -1373,6 +1385,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	strbuf_release(&key);
>  	junk_mode = JUNK_LEAVE_ALL;
>  
> +	free(unborn_head_target);
>  	strvec_clear(&ref_prefixes);
>  	return err;
>  }
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 58b7c1fbdc..9f921dfab4 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  	version = discover_version(&reader);
>  	switch (version) {
>  	case protocol_v2:
> -		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
> +		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
> +				args.stateless_rpc, NULL);
>  		break;
>  	case protocol_v1:
>  	case protocol_v0:
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index ecf8537605..a7ef59acfc 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1393,7 +1393,7 @@ static int do_fetch(struct transport *transport,
>  
>  	if (must_list_refs) {
>  		trace2_region_enter("fetch", "remote_refs", the_repository);
> -		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
> +		remote_refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
>  		trace2_region_leave("fetch", "remote_refs", the_repository);
>  	} else
>  		remote_refs = NULL;
> diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
> index 092917eca2..4cf3f60b1b 100644
> --- a/builtin/ls-remote.c
> +++ b/builtin/ls-remote.c
> @@ -118,7 +118,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
>  	if (server_options.nr)
>  		transport->server_options = &server_options;
>  
> -	ref = transport_get_remote_refs(transport, &ref_prefixes);
> +	ref = transport_get_remote_refs(transport, &ref_prefixes, NULL);
>  	if (ref) {
>  		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
>  		repo_set_hash_algo(the_repository, hash_algo);
> diff --git a/builtin/remote.c b/builtin/remote.c
> index c1b211b272..246e62f118 100644
> --- a/builtin/remote.c
> +++ b/builtin/remote.c
> @@ -950,7 +950,7 @@ static int get_remote_ref_states(const char *name,
>  	if (query) {
>  		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
>  			states->remote->url[0] : NULL);
> -		remote_refs = transport_get_remote_refs(transport, NULL);
> +		remote_refs = transport_get_remote_refs(transport, NULL, NULL);
>  		transport_disconnect(transport);
>  
>  		states->queried = 1;
> diff --git a/connect.c b/connect.c
> index 8b8f56cf6d..3c35324b4c 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
>  }
>  
>  /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
> -static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
> +static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
> +			  char **unborn_head_target)
>  {
>  	int ret = 1;
>  	int i = 0;
> @@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
>  		goto out;
>  	}
>  
> +	if (!strcmp("unborn", line_sections.items[i].string)) {
> +		i++;
> +		if (unborn_head_target &&
> +		    !strcmp("HEAD", line_sections.items[i++].string)) {
> +			/*
> +			 * Look for the symref target (if any). If found,
> +			 * return it to the caller.
> +			 */
> +			for (; i < line_sections.nr; i++) {
> +				const char *arg = line_sections.items[i].string;
> +
> +				if (skip_prefix(arg, "symref-target:", &arg)) {
> +					*unborn_head_target = xstrdup(arg);
> +					break;
> +				}
> +			}
> +		}
> +		goto out;
> +	}
>  	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
>  	    *end) {
>  		ret = 0;
> @@ -455,7 +475,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  			     struct ref **list, int for_push,
>  			     const struct strvec *ref_prefixes,
>  			     const struct string_list *server_options,
> -			     int stateless_rpc)
> +			     int stateless_rpc,
> +			     char **unborn_head_target)
>  {
>  	int i;
>  	const char *hash_name;
> @@ -488,6 +509,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  	if (!for_push)
>  		packet_write_fmt(fd_out, "peel\n");
>  	packet_write_fmt(fd_out, "symrefs\n");
> +	if (server_supports_feature("ls-refs", "unborn", 0))
> +		packet_write_fmt(fd_out, "unborn\n");
>  	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
>  		packet_write_fmt(fd_out, "ref-prefix %s\n",
>  				 ref_prefixes->v[i]);
> @@ -496,7 +519,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  
>  	/* Process response from server */
>  	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> -		if (!process_ref_v2(reader, &list))
> +		if (!process_ref_v2(reader, &list, unborn_head_target))
>  			die(_("invalid ls-refs response: %s"), reader->line);
>  	}
>  
> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..fdb644b482 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -32,6 +32,8 @@ struct ls_refs_data {
>  	unsigned peel;
>  	unsigned symrefs;
>  	struct strvec prefixes;
> +	unsigned allow_unborn : 1;
> +	unsigned unborn : 1;
>  };
>  
>  static int send_ref(const char *refname, const struct object_id *oid,
> @@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	if (!ref_match(&data->prefixes, refname_nons))
>  		return 0;
>  
> -	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +	if (oid)
> +		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +	else
> +		strbuf_addf(&refline, "unborn %s", refname_nons);
>  	if (data->symrefs && flag & REF_ISSYMREF) {
>  		struct object_id unused;
>  		const char *symref_target = resolve_ref_unsafe(refname, 0,
> @@ -74,8 +79,28 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	return 0;
>  }
>  
> -static int ls_refs_config(const char *var, const char *value, void *data)
> +static void send_possibly_unborn_head(struct ls_refs_data *data)
>  {
> +	struct strbuf namespaced = STRBUF_INIT;
> +	struct object_id oid;
> +	int flag;
> +	int null_oid;
> +
> +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
> +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
> +	null_oid = is_null_oid(&oid);
> +	if (!null_oid || (data->symrefs && (flag & REF_ISSYMREF)))
> +		send_ref(namespaced.buf, null_oid ? NULL : &oid, flag, data);
> +	strbuf_release(&namespaced);
> +}
> +
> +static int ls_refs_config(const char *var, const char *value, void *cb_data)
> +{
> +	struct ls_refs_data *data = cb_data;
> +
> +	if (!strcmp("lsrefs.unborn", var))
> +		data->allow_unborn = !strcmp(value, "allow") ||
> +			!strcmp(value, "advertise");
>  	/*
>  	 * We only serve fetches over v2 for now, so respect only "uploadpack"
>  	 * config. This may need to eventually be expanded to "receive", but we
> @@ -91,7 +116,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  
>  	memset(&data, 0, sizeof(data));
>  
> -	git_config(ls_refs_config, NULL);
> +	git_config(ls_refs_config, &data);
>  
>  	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
>  		const char *arg = request->line;
> @@ -103,14 +128,35 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  			data.symrefs = 1;
>  		else if (skip_prefix(arg, "ref-prefix ", &out))
>  			strvec_push(&data.prefixes, out);
> +		else if (data.allow_unborn && !strcmp("unborn", arg))
> +			data.unborn = 1;
>  	}
>  
>  	if (request->status != PACKET_READ_FLUSH)
>  		die(_("expected flush after ls-refs arguments"));
>  
> -	head_ref_namespaced(send_ref, &data);
> +	if (data.unborn)
> +		send_possibly_unborn_head(&data);
> +	else
> +		head_ref_namespaced(send_ref, &data);
>  	for_each_namespaced_ref(send_ref, &data);
>  	packet_flush(1);
>  	strvec_clear(&data.prefixes);
>  	return 0;
>  }
> +
> +int ls_refs_advertise(struct repository *r, struct strbuf *value)
> +{
> +	if (value) {
> +		char *str = NULL;
> +
> +		if (!repo_config_get_string(the_repository, "lsrefs.unborn",
> +					    &str) &&
> +		    !strcmp("advertise", str)) {
> +			strbuf_addstr(value, "unborn");
> +			free(str);
> +		}
> +	}
> +
> +	return 1;
> +}
> diff --git a/ls-refs.h b/ls-refs.h
> index 7b33a7c6b8..a99e4be0bd 100644
> --- a/ls-refs.h
> +++ b/ls-refs.h
> @@ -6,5 +6,6 @@ struct strvec;
>  struct packet_reader;
>  int ls_refs(struct repository *r, struct strvec *keys,
>  	    struct packet_reader *request);
> +int ls_refs_advertise(struct repository *r, struct strbuf *value);
>  
>  #endif /* LS_REFS_H */
> diff --git a/remote.h b/remote.h
> index 3211abdf05..967f2178d8 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -198,7 +198,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  			     struct ref **list, int for_push,
>  			     const struct strvec *ref_prefixes,
>  			     const struct string_list *server_options,
> -			     int stateless_rpc);
> +			     int stateless_rpc,
> +			     char **unborn_head_target);
>  
>  int resolve_remote_symref(struct ref *ref, struct ref *list);
>  
> diff --git a/serve.c b/serve.c
> index f6341206c4..30cb56d507 100644
> --- a/serve.c
> +++ b/serve.c
> @@ -62,7 +62,7 @@ struct protocol_capability {
>  
>  static struct protocol_capability capabilities[] = {
>  	{ "agent", agent_advertise, NULL },
> -	{ "ls-refs", always_advertise, ls_refs },
> +	{ "ls-refs", ls_refs_advertise, ls_refs },
>  	{ "fetch", upload_pack_advertise, upload_pack_v2 },
>  	{ "server-option", always_advertise, NULL },
>  	{ "object-format", object_format_advertise, NULL },

All of this looks good to me, and re unrelated recent questions about
packfile-uri I had it's really nice to have a narrow example of adding a
simple ls-refs time verb / functionality like this to the protocol.

> diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
> index 7f082fb23b..d3bd79987b 100755
> --- a/t/t5606-clone-options.sh
> +++ b/t/t5606-clone-options.sh
> @@ -102,11 +102,12 @@ test_expect_success 'redirected clone -v does show progress' '
>  '
>  
>  test_expect_success 'chooses correct default initial branch name' '
> -	git init --bare empty &&
> +	git -c init.defaultBranch=foo init --bare empty &&
> +	test_config -C empty lsrefs.unborn advertise &&

Isn't this reducing test coverage? You're changing an existing
argument-less "init --bare" test's behavior,

>  	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
>  	git -c init.defaultBranch=up clone empty whats-up &&
> -	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
> -	test refs/heads/up = $(git -C whats-up config branch.up.merge)
> +	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
> +	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
>  '

Also re the above point about discoverability: Right below this we test
"init --initial-branch=guess". Wouldn't a way to unify bring
fetch/init/clone functionality be to use that as a jump-off point,
i.e. clone having --use-remote-initial-branch, init optionally leaving
behind a (broken) empty/nonexisting HEAD, and "fetch" with an argument
also supporting --use-remote-initial-branch or something.

> +test_expect_success 'clone of empty repo propagates name of default branch' '
> +	git -c init.defaultbranch=mydefaultbranch init file_empty_parent &&
> +	test_config -C file_empty_parent lsrefs.unborn advertise &&
> +
> +	git -c init.defaultbranch=main -c protocol.version=2 \
> +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&

Nit. Let's spell config.likeThis not config.likethis when not in the C
code.
Felipe Contreras Dec. 14, 2020, 3:51 p.m. UTC | #3
Ævar Arnfjörð Bjarmason wrote:
> On Fri, Dec 11 2020, Jonathan Tan wrote:
> 
> > When cloning an empty repository, a default branch is created. However,
> > it is named after the locally configured init.defaultBranch, not the
> > default branch of the remote repository.
> >
> > To solve this, the remote needs to communicate the target of the HEAD
> > symref, and "git clone" needs to use this information.
> >
> > Currently, symrefs that have unborn targets (such as in this case) are
> > not communicated by the protocol. Teach Git to advertise and support the
> > "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
> > This feature indicates that "ls-refs" supports the "unborn" argument;
> > when it is specified, "ls-refs" will send the HEAD symref with the name
> > of its unborn target.
> >
> > On the client side, Git will always send the "unborn" argument if it is
> > supported by the server. During "git clone", if cloning an empty
> > repository, Git will use the new information to determine the local
> > branch to create. In all other cases, Git will ignore it.
> 
> I'm not a fan of this change not because of the whole s/master/whatever/
> discussion, but because of the magic it adds for seemingly little gain &
> without any documentation.

I am against the master rename, and yet I am in favor of this patch.

I have been running git with "init.defaultbranch=foobar" to prepare
myself to a future in which the Git project chooses an objectively
inferior default branch name.

When I clone an empty repository, I expect the branch name to be chosen
by the person who created that repository (not 'foobar'). If GitHub
chooses to name the default branch "main", they can tell their users to
always clone the empty repository, and the users don't need to be
instructed to do anything else (like "git init -b main").

This way the Git project could follow a simple maxim:

  He who creates the repository chooses the master branch name

And this in addition offloads the burden on the Git project to choose a
particular default branch name.

> So if I have init.defaultBranch explicitly set that'll be ignored on
> "clone", but on "init/git remote add/fetch" it won't?

It is already ignored on clone... except when the repository is empty.

> I think so, and I swear I knew yesterday when I read this patch, but now
> I can't remember. Anyway, the point that I avoided re-reading the patch
> to find out, because even if there's an on-list answer to that it should
> really be documented because I'll forget it next week, and our users
> will never know :)

I think the patch does bring the expected behavior. The current behavior
is the one that is unexpected, and has been unnoticed simply because
most repositories use the name "master".

Some people would call that unexpected behavior a bug.

By removing the bug we don't have to document it.

> This patch also leaves Documentation/config/init.txt untouched, and now
> under lsrefs.unborn it explicitly contradicts the behavior of git:
> 
>     Allows overriding the default branch name e.g. when initializing
>     a new repository or when cloning an empty repository.

That should be updated.

> Shouldn't this at the very least be a
> init.defaultBranchFromRemote=<bool> which if set overrides
> init.defaultBranch? We could turn that to "true" by default and get the
> same behavior as you have here, but with less inexplicable magic for the
> user, no?

I don't think init.defaultbranch has lived long enough for people to
rely on the "buggy" behavior.

> It seems if you're a user and wonder why a clone of a bare repo doesn't
> give you "init" defaults the only way you'll find out is
> GIT_TRACE_PACKET and the like.

Yeah, but who created that repository?

If you configure Git to use "master", and cloning a new repository from
GitHub fetches "main", you know who to blame.

Let them take backlash.

I suspect they will eventually be forced to provide an option.

> Another reason I'm not a fan of it is because it's another piece of
> magic "clone" does that you can't emulate in "init/fetch". We have
> e.g. --single-branch as an existing case of that (although you can at
> least do that with parse ls-remote -> init -> config -> fetch), and
> that's a case that doesn't fit into a refspec.
> 
> But shouldn't there at least be a corresponding "fetch" option? On init
> we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
> ..."?

That would be better, yes.

But let's not let the perfect be the enemy of the good.

> And on the "litte gain" side of things: I very much suspect that the
> only users who'll ever use this will be some big hosting providers (but
> maybe not, the commit doesn't suggest a use-case).

Yes, and that's enough reason.

I say let's offload the branch name decision to them, and let *them*
deal with the fallback from their users.

> Wouldn't this be even more useful in those cases by just a pre-receive
> hook on their side detecting an initial push refusing "master", and:

But that's not what they want to do (I suspect).

They want the default branch name be "main", but still the user could
rename the branch, do the initial commit, and push without problems.

Yes, it would take years to roll this change, but it also takes years
for them to update their initial repository instructions too (they
haven't included "git init -b main" yet).

Cheers.
Junio C Hamano Dec. 14, 2020, 4:30 p.m. UTC | #4
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> On the client side, Git will always send the "unborn" argument if it is
>> supported by the server. During "git clone", if cloning an empty
>> repository, Git will use the new information to determine the local
>> branch to create. In all other cases, Git will ignore it.
>
> I'm not a fan of this change not because of the whole s/master/whatever/
> discussion, but because of the magic it adds for seemingly little gain &
> without any documentation.
>
> So if I have init.defaultBranch explicitly set that'll be ignored on
> "clone", but on "init/git remote add/fetch" it won't?

That description is backwards.

To help interoperate with the repository you cloned from better, we
made it easy to use whatever your 'origin' uses. "git clone" does so
by (1) in the original implementation, by inferring where HEAD
points at over there by comparing the objects reported for HEAD and
tips of branches (2) later, by adding symref capability to the
protocl so that the sending repository can tell exactly which branch
its HEAD points at.  What was lacking was that symref capability is
not sent if there is nothing in the repository.  And I think this is
an attempt to bring that "cloning nothing" case in line with a clone
of a repository with contents.

> Shouldn't this at the very least be a
> init.defaultBranchFromRemote=<bool> which if set overrides
> init.defaultBranch? We could turn that to "true" by default and get the
> same behavior as you have here, but with less inexplicable magic for the
> user, no?

I view the change in the patch being discussed a bugfix (clone ought
to follow whatever the other side uses by default, unless you say -b,
and the case when cloning an empty repository was buggy).  I am OK
if we wanted to consider a _new_ feature to always use the name you
want locally (i.e. as if you added "-b $(git config init.defaultBranch)"
on your "git clone" command line), but that is a new feature that needs
to be discussed in a separate topic.

> Another reason I'm not a fan of it is because it's another piece of
> magic "clone" does that you can't emulate in "init/fetch".

That ship has sailed longlonglong time ago when dfeff66e (revamp
git-clone., 2006-03-20) started pointing our HEAD to match theirs.

> But shouldn't there at least be a corresponding "fetch" option? On init
> we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
> ..."?

It may be nice to have a corresponding one, but again, that is a
separate topic on a new feature, and not relevant in the context of
this fix.

> Maybe not for reasons I haven't thought of, but I'd at least be much
> happier with an updated commit message justifying another special-case
> in clone that you can't do with "init/fetch".

This is *not* another special-case, but is 14-year old outstanding
one, so I do not think there specifically needs such justification.
The log message DOES need to be clarified.  Your mistaking that this
is a new feature and not a bugfix may be a good indication that the
proposed log message is not doing its job.

> And on the "litte gain" side of things: I very much suspect that the
> only users who'll ever use this will be some big hosting providers (but
> maybe not, the commit doesn't suggest a use-case).

Explorers who learn this new GitHub or GitLab thingy, create an
empty repository there and then clone it to their local disk, just
to dip their toes in the water, would most benefit.  Those of us who
are working on an already existing and populated projects won't be
helped or bothered.  We do sometimes create our own repositories and
publish to hosting sites, and I expect that many experienced Git
users follow the "local first and the push", and they won't be
helped or bothered.

But I expect some do "create a void at the hosting site and clone to
get a local playpen" for their real projects.  They would be helped,
and because Git userbase is populous enough that their number in
absolute terms would not be insignificant, even if they weren't in
percentage terms.
Jonathan Tan Dec. 14, 2020, 7:25 p.m. UTC | #5
> I'm not a fan of this change not because of the whole s/master/whatever/
> discussion, but because of the magic it adds for seemingly little gain &
> without any documentation.
> 
> So if I have init.defaultBranch explicitly set that'll be ignored on
> "clone", but on "init/git remote add/fetch" it won't?
> 
> I think so, and I swear I knew yesterday when I read this patch, but now
> I can't remember. Anyway, the point that I avoided re-reading the patch
> to find out, because even if there's an on-list answer to that it should
> really be documented because I'll forget it next week, and our users
> will never know :)

That's the plan - yes. It makes sense to me that "git clone" will not
use "init.defaultBranch" (especially since it has "init" in the name),
but "git init" will. (It also makes sense to me that "git remote add"
and "git fetch" will not change HEAD.)

> This patch also leaves Documentation/config/init.txt untouched, and now
> under lsrefs.unborn it explicitly contradicts the behavior of git:
> 
>     Allows overriding the default branch name e.g. when initializing
>     a new repository or when cloning an empty repository.

Ah...thanks for the pointer. I'll change it.

> Shouldn't this at the very least be a
> init.defaultBranchFromRemote=<bool> which if set overrides
> init.defaultBranch? We could turn that to "true" by default and get the
> same behavior as you have here, but with less inexplicable magic for the
> user, no?

I think you're coming with the idea that it is perfectly natural for
"git clone" to respect "init.defaultBranch", but that doesn't even
happen in the typical case wherein we clone a non-empty repository - so
I don't agree with that idea.

> It seems if you're a user and wonder why a clone of a bare repo doesn't
> give you "init" defaults the only way you'll find out is
> GIT_TRACE_PACKET and the like.

I assume you mean empty repo instead of bare repo? For me, I would find
it more surprising that the resulting local repo didn't have the same
HEAD as the remote.

> Another reason I'm not a fan of it is because it's another piece of
> magic "clone" does that you can't emulate in "init/fetch". We have
> e.g. --single-branch as an existing case of that (although you can at
> least do that with parse ls-remote -> init -> config -> fetch), and
> that's a case that doesn't fit into a refspec.

Same answer as above.

> But shouldn't there at least be a corresponding "fetch" option? On init
> we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
> ..."?

I think that it's OK for "clone" to create HEAD, but not OK for "fetch"
to modify HEAD.

> Maybe not for reasons I haven't thought of, but I'd at least be much
> happier with an updated commit message justifying another special-case
> in clone that you can't do with "init/fetch".

Same answer as above - I don't think this is a special case.

> And on the "litte gain" side of things: I very much suspect that the
> only users who'll ever use this will be some big hosting providers (but
> maybe not, the commit doesn't suggest a use-case). Wouldn't this be even
> more useful in those cases by just a pre-receive hook on their side
> detecting an initial push refusing "master", and:
> 
>     git push -o yes-use-old-init-default <...>
> 
> Instead of a patch to git to do the same & which would take $SOMEYEARS
> to be rolled out, since it depends on client-side understanding.

This would detect the problem only upon push.

> > @@ -62,7 +62,7 @@ struct protocol_capability {
> >  
> >  static struct protocol_capability capabilities[] = {
> >  	{ "agent", agent_advertise, NULL },
> > -	{ "ls-refs", always_advertise, ls_refs },
> > +	{ "ls-refs", ls_refs_advertise, ls_refs },
> >  	{ "fetch", upload_pack_advertise, upload_pack_v2 },
> >  	{ "server-option", always_advertise, NULL },
> >  	{ "object-format", object_format_advertise, NULL },
> 
> All of this looks good to me, and re unrelated recent questions about
> packfile-uri I had it's really nice to have a narrow example of adding a
> simple ls-refs time verb / functionality like this to the protocol.

Thanks.

> > diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
> > index 7f082fb23b..d3bd79987b 100755
> > --- a/t/t5606-clone-options.sh
> > +++ b/t/t5606-clone-options.sh
> > @@ -102,11 +102,12 @@ test_expect_success 'redirected clone -v does show progress' '
> >  '
> >  
> >  test_expect_success 'chooses correct default initial branch name' '
> > -	git init --bare empty &&
> > +	git -c init.defaultBranch=foo init --bare empty &&
> > +	test_config -C empty lsrefs.unborn advertise &&
> 
> Isn't this reducing test coverage? You're changing an existing
> argument-less "init --bare" test's behavior,

The test here is regarding "clone", not the behavior of "init". I'm
doing some textual comparison below, so I want to insulate this test
against future default branch name changes.

> 
> >  	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
> >  	git -c init.defaultBranch=up clone empty whats-up &&
> > -	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
> > -	test refs/heads/up = $(git -C whats-up config branch.up.merge)
> > +	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
> > +	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
> >  '
> 
> Also re the above point about discoverability: Right below this we test
> "init --initial-branch=guess". Wouldn't a way to unify bring
> fetch/init/clone functionality be to use that as a jump-off point,
> i.e. clone having --use-remote-initial-branch, 

OK - this is already happening for non-empty repositories, and my patch
makes it also happen for empty repositories.

> init optionally leaving
> behind a (broken) empty/nonexisting HEAD, 

I'm not sure how this is superior to just using what the remote has
(upon "clone") and using init.defaultBranch when no remote is involved
(upon "init").

> and "fetch" with an argument
> also supporting --use-remote-initial-branch or something.

Again, I don't think that "fetch" should update HEAD.

> 
> > +test_expect_success 'clone of empty repo propagates name of default branch' '
> > +	git -c init.defaultbranch=mydefaultbranch init file_empty_parent &&
> > +	test_config -C file_empty_parent lsrefs.unborn advertise &&
> > +
> > +	git -c init.defaultbranch=main -c protocol.version=2 \
> > +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
> 
> Nit. Let's spell config.likeThis not config.likethis when not in the C
> code.

OK - will do.
Felipe Contreras Dec. 14, 2020, 7:42 p.m. UTC | #6
Jonathan Tan wrote:
> > But shouldn't there at least be a corresponding "fetch" option? On init
> > we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
> > ..."?
> 
> I think that it's OK for "clone" to create HEAD, but not OK for "fetch"
> to modify HEAD.

Not the local HEAD, the remote HEAD.

See my proposal to update the remote head in different scenarios:

https://lore.kernel.org/git/20201118091219.3341585-1-felipe.contreras@gmail.com/
Jeff King Dec. 15, 2020, 1:27 a.m. UTC | #7
On Fri, Dec 11, 2020 at 01:05:08PM -0800, Jonathan Tan wrote:

> Subject: Re: [PATCH] clone: in protocol v2, use remote's default branch
> 
> When cloning an empty repository, a default branch is created. However,
> it is named after the locally configured init.defaultBranch, not the
> default branch of the remote repository.

Your subject line puzzled me at first, because I thought we already did
that. And indeed we do, but this is about adding the unborn case. I
think this contributed to Ævar's confusion.

Maybe:

  Subject: clone: respect unborn remote HEAD

  When cloning, we choose the default branch based on the remote HEAD.
  But if there is no remote HEAD, we'll fall back to using our local
  init.defaultBranch. Traditionally this hasn't been a big deal, because
  everybody used "master" as the default. But these days it is likely to
  cause confusion if the server and client implementations choose
  different values (e.g., if the remote started with "main", we may
  choose "master" locally, create commits there, and then the user is
  surprised when they push to "master" and not "main").

  To solve this...

makes the current state more clear, as well as motivating why we care.

It might also be worth breaking the patch up a bit. E.g., implement the
capability in upload-pack, then infrastructure for the client to use the
capability and surface the info to transport callers, and then finally
surface it to in the program logic of ls-refs, then clone, etc.

Not strictly necessary, but it make it easier to see what is being
changed at each step.

> Currently, symrefs that have unborn targets (such as in this case) are
> not communicated by the protocol. Teach Git to advertise and support the
> "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
> This feature indicates that "ls-refs" supports the "unborn" argument;
> when it is specified, "ls-refs" will send the HEAD symref with the name
> of its unborn target.

It's probably also worth mentioning that v0 won't get any support here,
and why.

-Peff
Ævar Arnfjörð Bjarmason Dec. 15, 2020, 1:41 a.m. UTC | #8
On Mon, Dec 14 2020, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>> Maybe not for reasons I haven't thought of, but I'd at least be much
>> happier with an updated commit message justifying another special-case
>> in clone that you can't do with "init/fetch".
>
> This is *not* another special-case, but is 14-year old outstanding
> one, so I do not think there specifically needs such justification.
> The log message DOES need to be clarified.  Your mistaking that this
> is a new feature and not a bugfix may be a good indication that the
> proposed log message is not doing its job.

For context: This clone feature has been there since early 2009, it
wasn't until late 2017/early 2018 that we had protocol v2 that gave us
the ability to fix the bug.

I suppose the distinction between what's new behavior and what's a
bugfix in something that was really meant to work a certain way all
along but didn't is too subtle for me to discern sometimes :)

86ac7518590 (Allow cloning an empty repository, 2009-01-23) which added
it seems to match my mental model of it being just a shortcut for some
of the the URL config "init" otherwise wouldn't setup for you. At a time
when git-init.txt said:

    An initial `HEAD` file that references the HEAD of the master branch
    is also created.

> > Another reason I'm not a fan of it is because it's another piece of
> > magic "clone" does that you can't emulate in "init/fetch".
> 
> That ship has sailed longlonglong time ago when dfeff66e (revamp
> git-clone., 2006-03-20) started pointing our HEAD to match theirs.

Let me rephrase: I think it's unfortunate when we add new things things
to porcelain commands that aren't easy or possible to emulate in
plumbing.

I.e. this feature seems like a candidate to be exposed by something like
by a ls-remote flag if you'd like to do an init/config/fetch. AFAIK the
only way to do it is to mock a "clone" with GIT_TRACE_PACKET or get the
information out-of-bounds.

>> And on the "litte gain" side of things: I very much suspect that the
>> only users who'll ever use this will be some big hosting providers (but
>> maybe not, the commit doesn't suggest a use-case).
>
> Explorers who learn this new GitHub or GitLab thingy, create an
> empty repository there and then clone it to their local disk, just
> to dip their toes in the water, would most benefit.  Those of us who
> are working on an already existing and populated projects won't be
> helped or bothered.  We do sometimes create our own repositories and
> publish to hosting sites, and I expect that many experienced Git
> users follow the "local first and the push", and they won't be
> helped or bothered.
>
> But I expect some do "create a void at the hosting site and clone to
> get a local playpen" for their real projects.  They would be helped,
> and because Git userbase is populous enough that their number in
> absolute terms would not be insignificant, even if they weren't in
> percentage terms.

That's how I've always used it. Seems from the above-referenced
5cd12b85fe8 that's what it was meant for to begin with.

Anyway, there's 3 replies to my E-Mail including yours insisting this
makes perfect sense, I'm happy to go along with the consensus. I wrote
my reply with the assumption that it was obvious that this was a change
in established behavior, but apparently that's not the prevailing view.

To borrow from Felipe Contreras's reply in the side-thread "I expect the
branch name to be chosen by the person who created that repository".

I suppose this comes down to a mental model of what it means to have
"created a repository". When I click "create repo" on those popular
hosting sites (e.g. github & gitlab) and clone it I was expecting it to
just be a shorthand init + a URL in my config (and refspecs...).

That's also what happens with this patch if you "git init --bare
/tmp/my.git", then edit the HEAD symref to point to "foobar" and clone
it with file:///, it'll be "master" in your clone (or whatever
init.defaultBranch is). Isn't that discrepancy a bug then?

But of course then when you push your "foobar" as the first branch the
HEAD symref won't be updated. In the olden times when everyone ran their
own git server this was a common FAQ, "just run 'git symbolic-ref'".

On both of those big hosting sites (didn't test others) whatever their
preferred default name is they'll go with your idea and update HEAD's
pointer on the first such push. So this notion that the default unborn
symref isn't transported & it's up to the client to set it on-push (or
manually afterwards) has been reinforced by in-the-wild use.
Junio C Hamano Dec. 15, 2020, 2:22 a.m. UTC | #9
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I.e. this feature seems like a candidate to be exposed by something like
> by a ls-remote flag if you'd like to do an init/config/fetch. AFAIK the
> only way to do it is to mock a "clone" with GIT_TRACE_PACKET or get the
> information out-of-bounds.

Yes, I think the updated protocol should be able to help adding a
new feature usable by script writers, and I agree that ls-remote may
be the ideal home for such a feature.

> To borrow from Felipe Contreras's reply in the side-thread "I expect the
> branch name to be chosen by the person who created that repository".

I expect a bit differently.  My expectation is that "git clone"
tries to help you inter-operate well with the project you clone.
You may be creating your local repository by cloning theirs, but
because I do not expect "who created" matters more than how well
end-users' workflows would work in the resulting repository, I
do not expect local init.defaultBranch should matter here.

If the project you would eventually push back to designates one
branch as its primary branch everybody is expected to work off of
(which is what it means to point it with their HEAD), it is
convenient if your local clone names your primary branch to match.
The push.default settings like 'simple' and 'current' are designed
to work well when your local branch namespace matches what they
have.

> I suppose this comes down to a mental model of what it means to have
> "created a repository". When I click "create repo" on those popular
> hosting sites (e.g. github & gitlab) and clone it I was expecting it to
> just be a shorthand init + a URL in my config (and refspecs...).

So, no, I do not think "who created a repository" has much to do
with the objective of the patch in question.  It's really "what's
the upstream's view of the primary branch".

> That's also what happens with this patch if you "git init --bare
> /tmp/my.git", then edit the HEAD symref to point to "foobar" and clone
> it with file:///, it'll be "master" in your clone (or whatever
> init.defaultBranch is). Isn't that discrepancy a bug then?

Yes, I view it as the same bug to be fixed; JTan's protocol update
patch only deals with the transport based on the git protocol and
does not (yet?) address the --local short-cut.  In principle, it
should be a lot easier than the protocol update.  Any takers?

> On both of those big hosting sites (didn't test others) whatever their
> preferred default name is they'll go with your idea and update HEAD's
> pointer on the first such push. So this notion that the default unborn
> symref isn't transported & it's up to the client to set it on-push (or
> manually afterwards) has been reinforced by in-the-wild use.

I think it would be great if somebody comes up with a protocol
update for that "other" direction to push into an unborn HEAD.  I
haven't thought things through, but you may be right to point out
that the "clone learns and prepares local to match the other side"
we are discussing may not be complete with such a corresponding fix
in the opposite direction.

Thanks.
Jeff King Dec. 15, 2020, 2:38 a.m. UTC | #10
On Tue, Dec 15, 2020 at 02:41:38AM +0100, Ævar Arnfjörð Bjarmason wrote:

> > > Another reason I'm not a fan of it is because it's another piece of
> > > magic "clone" does that you can't emulate in "init/fetch".
> > 
> > That ship has sailed longlonglong time ago when dfeff66e (revamp
> > git-clone., 2006-03-20) started pointing our HEAD to match theirs.
> 
> Let me rephrase: I think it's unfortunate when we add new things things
> to porcelain commands that aren't easy or possible to emulate in
> plumbing.
> 
> I.e. this feature seems like a candidate to be exposed by something like
> by a ls-remote flag if you'd like to do an init/config/fetch. AFAIK the
> only way to do it is to mock a "clone" with GIT_TRACE_PACKET or get the
> information out-of-bounds.

I think the situation is better than that. We are surfacing the remote
HEAD here, and there is already a command for copying that to our local
tracking symref: "git remote set-head origin -a", which will set up
refs/remotes/origin/HEAD.

I think there are two ways we could improve that further:

  - making it more natural to pick up or update the remote HEAD via
    fetch; Felipe's patches to git-fetch look good to me

  - it might be nice to be able to have some equivalent to the dwim "git
    checkout foo" that creates a new "foo" based off of origin/foo.
    Doing "git checkout origin/HEAD" will detach the HEAD. I think right
    now you'd have to do something like:

      tracking=$(git symbolic-ref refs/remotes/origin/HEAD)
      branch=${tracking#refs/remotes/origin/}
      git checkout -b $branch $tracking

    Or maybe not. It's not something people probably need to do a lot.
    And if the point is to have plumbing commands that can do the same,
    then maybe those commands are sufficient.

-Peff
Junio C Hamano Dec. 15, 2020, 2:55 a.m. UTC | #11
Jeff King <peff@peff.net> writes:

> I think the situation is better than that. We are surfacing the remote
> HEAD here, and there is already a command for copying that to our local
> tracking symref: "git remote set-head origin -a", which will set up
> refs/remotes/origin/HEAD.
>
> I think there are two ways we could improve that further:
>
>   - making it more natural to pick up or update the remote HEAD via
>     fetch; Felipe's patches to git-fetch look good to me

I do not mind that as an option (not the default) to the "git fetch"
command.  But I think Ævar was driving at the lack of a scriptable
building block.

>   - it might be nice to be able to have some equivalent to the dwim "git
>     checkout foo" that creates a new "foo" based off of origin/foo.
>     Doing "git checkout origin/HEAD" will detach the HEAD. I think right
>     now you'd have to do something like:
>
>       tracking=$(git symbolic-ref refs/remotes/origin/HEAD)
>       branch=${tracking#refs/remotes/origin/}
>       git checkout -b $branch $tracking

Meaning "git checkout origin" would look at origin/HEAD and find the
remote-tracking branch it points at, and uses that name?  I think
that does make quite a lot of sense.  You are correct to point out
that not just "git checkout origin/HEAD", but "git checkout origin",
currently detaches the HEAD at that commit, if you have origin/HEAD
pointing at one of the remote-tracking branches.

But if we were to make such a change, "git fetch" shouldn't
automatically update remotes/origin/HEAD, I would think.  It does
not matter too much if we are talking about a publishing repository
where the HEAD rarely changes (and when it does, it is a significant
event that everybody in the downstream should take notice), but if
you clone from a live repository with active development, you do not
want to lose a stable reference to what you consider as the primary
branch at your origin repository.

>     Or maybe not. It's not something people probably need to do a lot.
>     And if the point is to have plumbing commands that can do the same,
>     then maybe those commands are sufficient.
Felipe Contreras Dec. 15, 2020, 3:22 a.m. UTC | #12
Ævar Arnfjörð Bjarmason wrote:

> To borrow from Felipe Contreras's reply in the side-thread "I expect the
> branch name to be chosen by the person who created that repository".
> 
> I suppose this comes down to a mental model of what it means to have
> "created a repository". When I click "create repo" on those popular
> hosting sites (e.g. github & gitlab) and clone it I was expecting it to
> just be a shorthand init + a URL in my config (and refspecs...).

Indeed. But then it would be *them* taking away agency from the user (by
not allowing the user to choose the name of the branch), not us.

In Spanish we have a saying: "don't give extra change", which is similar to
the lawyery advice: "don't volunteer [information]".

Let's not volunteer user complaints. Let GitHub take some of those.
Jeff King Dec. 15, 2020, 4:36 a.m. UTC | #13
On Mon, Dec 14, 2020 at 06:55:33PM -0800, Junio C Hamano wrote:

> >   - it might be nice to be able to have some equivalent to the dwim "git
> >     checkout foo" that creates a new "foo" based off of origin/foo.
> >     Doing "git checkout origin/HEAD" will detach the HEAD. I think right
> >     now you'd have to do something like:
> >
> >       tracking=$(git symbolic-ref refs/remotes/origin/HEAD)
> >       branch=${tracking#refs/remotes/origin/}
> >       git checkout -b $branch $tracking
> 
> Meaning "git checkout origin" would look at origin/HEAD and find the
> remote-tracking branch it points at, and uses that name?  I think
> that does make quite a lot of sense.  You are correct to point out
> that not just "git checkout origin/HEAD", but "git checkout origin",
> currently detaches the HEAD at that commit, if you have origin/HEAD
> pointing at one of the remote-tracking branches.

I'm not sure if it's a good idea to change "git checkout origin" here or
not. It already does something useful. I was mostly suggesting that the
other thing might _also_ be useful, but I'm not sure if it is wise to
change the current behavior.

I was thinking more like an explicit way to trigger the dwim-behavior,
like:

  # same as "git checkout foo" magic that creates "foo", but we
  # have said explicitly both that we expect to make the new branch, and
  # also that we expect it to come from origin.
  git checkout --make-local origin/foo

  # similar, but because we are being explicit, we know it is reasonable
  # to dereference HEAD to find the actual branch name
  git checkout --make-local origin/HEAD

I dunno. I hate the name "--make-local", and in the non-dereferencing
form, it is not much different than just "git checkout -b foo
origin/foo". I'm mostly just thinking aloud here. :)

> But if we were to make such a change, "git fetch" shouldn't
> automatically update remotes/origin/HEAD, I would think.  It does
> not matter too much if we are talking about a publishing repository
> where the HEAD rarely changes (and when it does, it is a significant
> event that everybody in the downstream should take notice), but if
> you clone from a live repository with active development, you do not
> want to lose a stable reference to what you consider as the primary
> branch at your origin repository.

That seems orthogonal. Whether there is checkout magic or not, changing
what origin/HEAD points to would be disruptive to selecting it as a
tracking source, or doing diffs, or whatever. But that is why the
proposal in that series was to make the behavior configurable, and
default to "fill it in if missing" as the default, not "always update on
fetch".

-Peff
Jonathan Tan Dec. 15, 2020, 7:10 p.m. UTC | #14
> > Subject: Re: [PATCH] clone: in protocol v2, use remote's default branch
> > 
> > When cloning an empty repository, a default branch is created. However,
> > it is named after the locally configured init.defaultBranch, not the
> > default branch of the remote repository.
> 
> Your subject line puzzled me at first, because I thought we already did
> that. And indeed we do, but this is about adding the unborn case. I
> think this contributed to Ævar's confusion.
> 
> Maybe:
> 
>   Subject: clone: respect unborn remote HEAD
> 
>   When cloning, we choose the default branch based on the remote HEAD.
>   But if there is no remote HEAD, we'll fall back to using our local
>   init.defaultBranch. Traditionally this hasn't been a big deal, because
>   everybody used "master" as the default. But these days it is likely to
>   cause confusion if the server and client implementations choose
>   different values (e.g., if the remote started with "main", we may
>   choose "master" locally, create commits there, and then the user is
>   surprised when they push to "master" and not "main").
> 
>   To solve this...
> 
> makes the current state more clear, as well as motivating why we care.
> 
> It might also be worth breaking the patch up a bit. E.g., implement the
> capability in upload-pack, then infrastructure for the client to use the
> capability and surface the info to transport callers, and then finally
> surface it to in the program logic of ls-refs, then clone, etc.
> 
> Not strictly necessary, but it make it easier to see what is being
> changed at each step.

All this sounds good.

> > Currently, symrefs that have unborn targets (such as in this case) are
> > not communicated by the protocol. Teach Git to advertise and support the
> > "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
> > This feature indicates that "ls-refs" supports the "unborn" argument;
> > when it is specified, "ls-refs" will send the HEAD symref with the name
> > of its unborn target.
> 
> It's probably also worth mentioning that v0 won't get any support here,
> and why.

OK - thanks for your comments. I'll send out an updated version soon.
Junio C Hamano Dec. 16, 2020, 3:09 a.m. UTC | #15
Jeff King <peff@peff.net> writes:

>> Meaning "git checkout origin" would look at origin/HEAD and find the
>> remote-tracking branch it points at, and uses that name?  I think
>> that does make quite a lot of sense.  You are correct to point out
>> that not just "git checkout origin/HEAD", but "git checkout origin",
>> currently detaches the HEAD at that commit, if you have origin/HEAD
>> pointing at one of the remote-tracking branches.
>
> I'm not sure if it's a good idea to change "git checkout origin" here or
> not. It already does something useful. I was mostly suggesting that the
> other thing might _also_ be useful, but I'm not sure if it is wise to
> change the current behavior.

Well, "git checkout origin/HEAD" would also do something useful,
which happens to be identical to "git checkout origin", to detach
HEAD at the commit.

> I was thinking more like an explicit way to trigger the dwim-behavior,
> like:
>
>   # same as "git checkout foo" magic that creates "foo", but we
>   # have said explicitly both that we expect to make the new branch, and
>   # also that we expect it to come from origin.
>   git checkout --make-local origin/foo

By default I think --guess (formerly known as --dwim) is enabled, so
"git checkout foo" is "git checkout --guess foo", which is making
local 'foo' out of the uniquely found remote-tracking branch.  This
new one is to reduce the "uniquely found" part from the magic and
let you be a bit more explicit, but not explicit enough to say "-t"
or "-b foo"?  I am not sure if this is all that useful.

If this were a slightly different proposal, I would see the
convenience value in it, though.  Currently what "--guess" does is:

      If the name 'foo' given does not exist as a local branch,
      and the name appears exactly once as a remote-tracking branch
      from some remote (i.e. 'refs/remotes/origin/foo' exists, but
      there is no other 'refs/remotes/*/foo'), create a local 'foo'
      that builds on that remote-tracking branch and check it out.

What would happen if we tweaked the existing "--guess" behaviour
slightly?

      "git checkout --guess origin/foo", even when there is a second
      remote 'publish' that also has a remote-tracking branch for
      its 'foo' (i.e. both 'refs/remotes/{origin,publish}/foo'
      exists), can be used to disambiguate among these remotes with
      'foo'.  You'd get local 'foo' that builds on 'foo' from the
      remote 'origin' and check it out.

>   # similar, but because we are being explicit, we know it is reasonable
>   # to dereference HEAD to find the actual branch name
>   git checkout --make-local origin/HEAD

The user does not need "git symbolic-ref refs/remotes/origin/HEAD"
if such a feature were available.  "git checkout --some-option origin"
without having to say /HEAD may be a better UI, though.

And "checkout" being a Porcelain, and the DWIM feature that is
always on is subject to be improved for human use, I do not see why
that --some-option cannot be --guess.  If I want to get the current
behaviour, I can explicitly say "git checkout --detach origin"
anyway, no?

> That seems orthogonal. Whether there is checkout magic or not, changing
> what origin/HEAD points to would be disruptive to selecting it as a
> tracking source, or doing diffs, or whatever. But that is why the
> proposal in that series was to make the behavior configurable, and
> default to "fill it in if missing" as the default, not "always update on
> fetch".

Ah, I totally forgot that the favoured variant was "fill in if
missing, but don't move once it is set".  Yes, I think that is a
sensible default.

Thanks.
Jeff King Dec. 16, 2020, 6:39 p.m. UTC | #16
On Tue, Dec 15, 2020 at 07:09:50PM -0800, Junio C Hamano wrote:

> > I'm not sure if it's a good idea to change "git checkout origin" here or
> > not. It already does something useful. I was mostly suggesting that the
> > other thing might _also_ be useful, but I'm not sure if it is wise to
> > change the current behavior.
> 
> Well, "git checkout origin/HEAD" would also do something useful,
> which happens to be identical to "git checkout origin", to detach
> HEAD at the commit.

To be clear, I meant that both of those do the useful thing, and I'm not
sure if it would be confusing to users to change that (but see below).

> > I was thinking more like an explicit way to trigger the dwim-behavior,
> > like:
> >
> >   # same as "git checkout foo" magic that creates "foo", but we
> >   # have said explicitly both that we expect to make the new branch, and
> >   # also that we expect it to come from origin.
> >   git checkout --make-local origin/foo
> 
> By default I think --guess (formerly known as --dwim) is enabled, so
> "git checkout foo" is "git checkout --guess foo", which is making
> local 'foo' out of the uniquely found remote-tracking branch.  This
> new one is to reduce the "uniquely found" part from the magic and
> let you be a bit more explicit, but not explicit enough to say "-t"
> or "-b foo"?  I am not sure if this is all that useful.

I agree it's not all that useful in that example. What I was thinking is
that by giving the implicit/heuristic magic a more explicit verbose
name, then we make it natural to extend the explicit version to more
cases where it might be questionable to do it implicitly.

So no, I doubt anybody would normally type what I wrote above. But it
lets us explain it as:

  - there's a feature "--make-local" (I still hate the name) that makes
    a local branch from a remote one if it doesn't already exist

  - that feature knows how to resolve symrefs and create the branch from
    the pointed-to name

  - as a shortcut, "git checkout foo" is a synonym for "--make-local
    origin/foo" when "origin/foo" exists but "foo" does not

It's definitely not worth it, though, if we decide that there's an
implicit/heuristic syntax that should trigger the symref thing.

> If this were a slightly different proposal, I would see the
> convenience value in it, though.  Currently what "--guess" does is:
> 
>       If the name 'foo' given does not exist as a local branch,
>       and the name appears exactly once as a remote-tracking branch
>       from some remote (i.e. 'refs/remotes/origin/foo' exists, but
>       there is no other 'refs/remotes/*/foo'), create a local 'foo'
>       that builds on that remote-tracking branch and check it out.
> 
> What would happen if we tweaked the existing "--guess" behaviour
> slightly?
> 
>       "git checkout --guess origin/foo", even when there is a second
>       remote 'publish' that also has a remote-tracking branch for
>       its 'foo' (i.e. both 'refs/remotes/{origin,publish}/foo'
>       exists), can be used to disambiguate among these remotes with
>       'foo'.  You'd get local 'foo' that builds on 'foo' from the
>       remote 'origin' and check it out.

I forgot we had --guess. Piggy-backing on that might be sensible as a
stronger "explicit" signal that this is what the user wants (though
"--guess" is still a funny name here, because we're no longer guessing
at all; the user told us what they want).

But yeah, the semantics you outlined in the second paragraph match what
I was expecting "--make-local" to do.

> >   # similar, but because we are being explicit, we know it is reasonable
> >   # to dereference HEAD to find the actual branch name
> >   git checkout --make-local origin/HEAD
> 
> The user does not need "git symbolic-ref refs/remotes/origin/HEAD"
> if such a feature were available.  "git checkout --some-option origin"
> without having to say /HEAD may be a better UI, though.

Right. I'm assuming that "origin/HEAD" and "origin" could be used
interchangeably in my example.

> And "checkout" being a Porcelain, and the DWIM feature that is
> always on is subject to be improved for human use, I do not see why
> that --some-option cannot be --guess.  If I want to get the current
> behaviour, I can explicitly say "git checkout --detach origin"
> anyway, no?

I think:

  git checkout --guess origin

would make sense to dereference origin/HEAD to "foo", as if we had said
"git checkout foo". That's the explicit part that seems safe. My
question is whether:

  git checkout origin

should likewise do so. As you note, one can always use --detach to make
their intention clear, and checkout is a porcelain, so we are OK to
change it. But would users find that annoying? I frequently use "git
checkout origin" to get a detached HEAD pointing at your master (e.g.,
because I want to reproduce a bug, or do a "something like this..."
patch). I'm sure I could retrain my fingers, but I wonder if I'm not the
only one.

Doing it for only an explicit "--guess" turns that feature into a
tri-state (explicitly off, explicitly on, or "implicit, so be a little
more conservative"). Which perhaps is harder to explain, but I think
cleanly adds the new feature in a consistent way, without really
changing any existing behavior.

Related, I assume that:

  git checkout --guess origin/foo
  git checkout origin/foo

should behave the same as their "origin" or "origin/HEAD" counterparts
for consistency (obviously making "foo" in the former case, and either
detaching or making "foo" in the second case, depending on what you
think of the earlier paragraphs).

-Peff
Junio C Hamano Dec. 16, 2020, 8:56 p.m. UTC | #17
Jeff King <peff@peff.net> writes:

> I think:
>
>   git checkout --guess origin
>
> would make sense to dereference origin/HEAD to "foo", as if we had said
> "git checkout foo". That's the explicit part that seems safe. My
> question is whether:
>
>   git checkout origin
>
> should likewise do so.

I see.  I think "--guess" is by default true, so unless you have
checkout.guess=false configured, my answer to the above question is
yes.

> As you note, one can always use --detach to make
> their intention clear, and checkout is a porcelain, so we are OK to
> change it. But would users find that annoying? I frequently use "git
> checkout origin" to get a detached HEAD pointing at your master (e.g.,
> because I want to reproduce a bug, or do a "something like this..."
> patch). I'm sure I could retrain my fingers, but I wonder if I'm not the
> only one.

My fingers say "git checkout X^0" instead of "--detach" when I want
to detach for any value of X (e.g. "HEAD", "v2.28.0").  But I do
understand people like to be implicit when they can.

> Doing it for only an explicit "--guess" turns that feature into a
> tri-state (explicitly off, explicitly on, or "implicit, so be a little
> more conservative"). Which perhaps is harder to explain, but I think
> cleanly adds the new feature in a consistent way, without really
> changing any existing behavior.

Hmmmm...  I do not offhand know if that is a good idea or not.

> Related, I assume that:
>
>   git checkout --guess origin/foo
>   git checkout origin/foo
>
> should behave the same as their "origin" or "origin/HEAD" counterparts
> for consistency (obviously making "foo" in the former case, and either
> detaching or making "foo" in the second case, depending on what you
> think of the earlier paragraphs).

I think that is what I said in the "what would happen if we tweaked"
paragraph about using origin/ prefix as a disambiguator?  Then yes,
I think we are in agreement.

Thanks.
Jeff King Dec. 18, 2020, 6:19 a.m. UTC | #18
On Wed, Dec 16, 2020 at 12:56:22PM -0800, Junio C Hamano wrote:

> > I think:
> >
> >   git checkout --guess origin
> >
> > would make sense to dereference origin/HEAD to "foo", as if we had said
> > "git checkout foo". That's the explicit part that seems safe. My
> > question is whether:
> >
> >   git checkout origin
> >
> > should likewise do so.
> 
> I see.  I think "--guess" is by default true, so unless you have
> checkout.guess=false configured, my answer to the above question is
> yes.

Yes, I agree with the current definition of "--guess", the two would be
the same. I'm just concerned that people will be unhappy with changing
the behavior of the latter, so everything else (the "tri-state --guess"
thing) is an attempt to band-aid over that.

If we decide it's not a concern worth addressing, then I agree the two
should behave the same. I'm just not convinced it won't annoy people who
are used to how "git checkout" works now with non-local branches.

-Peff
diff mbox series

Patch

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index e597b74da3..dfe03aa114 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,19 @@  ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server may send symrefs pointing to unborn branches in the form
+	"unborn <refname> symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/builtin/clone.c b/builtin/clone.c
index a0841923cf..217c87fddf 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -980,6 +980,7 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	int submodule_progress;
 
 	struct strvec ref_prefixes = STRVEC_INIT;
+	char *unborn_head_target = NULL;
 
 	packet_trace_identity("clone");
 
@@ -1264,7 +1265,8 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_no_tags)
 		strvec_push(&ref_prefixes, "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &ref_prefixes,
+					 &unborn_head_target);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1323,10 +1325,20 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (unborn_head_target &&
+			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
+				ref = unborn_head_target;
+				unborn_head_target = NULL;
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
+			create_symref("HEAD", ref, "");
 			free(ref);
 		}
 	}
@@ -1373,6 +1385,7 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
+	free(unborn_head_target);
 	strvec_clear(&ref_prefixes);
 	return err;
 }
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..9f921dfab4 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@  int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..a7ef59acfc 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1393,7 +1393,7 @@  static int do_fetch(struct transport *transport,
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..4cf3f60b1b 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -118,7 +118,7 @@  int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/builtin/remote.c b/builtin/remote.c
index c1b211b272..246e62f118 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -950,7 +950,7 @@  static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport, NULL);
+		remote_refs = transport_get_remote_refs(transport, NULL, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..3c35324b4c 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@  struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@  static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -455,7 +475,8 @@  struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc)
+			     int stateless_rpc,
+			     char **unborn_head_target)
 {
 	int i;
 	const char *hash_name;
@@ -488,6 +509,8 @@  struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -496,7 +519,7 @@  struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..fdb644b482 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -32,6 +32,8 @@  struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned allow_unborn : 1;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +49,10 @@  static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -74,8 +79,28 @@  static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static int ls_refs_config(const char *var, const char *value, void *data)
+static void send_possibly_unborn_head(struct ls_refs_data *data)
 {
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int null_oid;
+
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
+	null_oid = is_null_oid(&oid);
+	if (!null_oid || (data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, null_oid ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
+static int ls_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+
+	if (!strcmp("lsrefs.unborn", var))
+		data->allow_unborn = !strcmp(value, "allow") ||
+			!strcmp(value, "advertise");
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
@@ -91,7 +116,7 @@  int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -103,14 +128,35 @@  int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (data.allow_unborn && !strcmp("unborn", arg))
+			data.unborn = 1;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	if (data.unborn)
+		send_possibly_unborn_head(&data);
+	else
+		head_ref_namespaced(send_ref, &data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		char *str = NULL;
+
+		if (!repo_config_get_string(the_repository, "lsrefs.unborn",
+					    &str) &&
+		    !strcmp("advertise", str)) {
+			strbuf_addstr(value, "unborn");
+			free(str);
+		}
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@  struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/remote.h b/remote.h
index 3211abdf05..967f2178d8 100644
--- a/remote.h
+++ b/remote.h
@@ -198,7 +198,8 @@  struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc);
+			     int stateless_rpc,
+			     char **unborn_head_target);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
diff --git a/serve.c b/serve.c
index f6341206c4..30cb56d507 100644
--- a/serve.c
+++ b/serve.c
@@ -62,7 +62,7 @@  struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..d3bd79987b 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,12 @@  test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.unborn advertise &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..380333b662 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,23 @@  test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	git -c init.defaultbranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.unborn advertise &&
+
+	git -c init.defaultbranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
+test_expect_success '...but not if it is not advertised' '
+	test_config -C file_empty_parent lsrefs.unborn none &&
+
+	git -c init.defaultbranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child_2 &&
+	grep "refs/heads/main" file_empty_child_2/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..5d97eba935 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,16 @@  static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 const struct strvec *ref_prefixes,
+				 char **unborn_head_target)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							ref_prefixes,
+							unborn_head_target);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..5037f6197d 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -18,19 +18,16 @@  struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
+	 *
+	 * See transport_get_remote_refs() for information on ref_prefixes and
+	 * unborn_head_target.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     const struct strvec *ref_prefixes,
+				     char **unborn_head_target);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 47da955e4f..815e175017 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,8 @@  struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -163,7 +164,7 @@  static int fetch_refs_from_bundle(struct transport *transport,
 	int ret;
 
 	if (!data->get_refs_from_bundle_called)
-		get_refs_from_bundle(transport, 0, NULL);
+		get_refs_from_bundle(transport, 0, NULL, NULL);
 	ret = unbundle(the_repository, &data->header, data->fd,
 			   transport->progress ? BUNDLE_VERBOSE : 0);
 	transport->hash_algo = data->header.hash_algo;
@@ -281,7 +282,7 @@  static void die_if_server_options(struct transport *transport)
  */
 static struct ref *handshake(struct transport *transport, int for_push,
 			     const struct strvec *ref_prefixes,
-			     int must_list_refs)
+			     int must_list_refs, char **unborn_head_target)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -301,7 +302,8 @@  static struct ref *handshake(struct transport *transport, int for_push,
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
 					ref_prefixes,
 					transport->server_options,
-					transport->stateless_rpc);
+					transport->stateless_rpc,
+					unborn_head_target);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -324,9 +326,11 @@  static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, ref_prefixes, 1,
+			 unborn_head_target);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -370,7 +374,7 @@  static int fetch_refs_via_pack(struct transport *transport,
 				break;
 			}
 		}
-		refs_tmp = handshake(transport, 0, NULL, must_list_refs);
+		refs_tmp = handshake(transport, 0, NULL, must_list_refs, NULL);
 	}
 
 	if (data->version == protocol_unknown_version)
@@ -765,7 +769,7 @@  static int git_transport_push(struct transport *transport, struct ref *remote_re
 		return -1;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1, NULL);
+		get_refs_via_connect(transport, 1, NULL, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1251,7 +1255,8 @@  int transport_push(struct repository *r,
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &ref_prefixes,
+							       NULL);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
 		strvec_clear(&ref_prefixes);
@@ -1370,12 +1375,14 @@  int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 ref_prefixes,
+							 unborn_head_target);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..65de0c9c00 100644
--- a/transport.h
+++ b/transport.h
@@ -241,9 +241,14 @@  int transport_push(struct repository *repo,
  * advertisement.  Since ref filtering is done on the server's end (and only
  * when using protocol v2), this can return refs which don't match the provided
  * ref_prefixes.
+ *
+ * If unborn_head_target is not NULL, and the remote reports HEAD as pointing
+ * to an unborn branch, this function stores the unborn branch in
+ * unborn_head_target. It should be freed by the caller.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target);
 
 /*
  * Fetch the hash algorithm used by a remote.