Message ID | 20201118091219.3341585-1-felipe.contreras@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC/PATCH] Add fetch.updateHead option | expand |
On Wed, Nov 18 2020, Felipe Contreras wrote: > Users might change the behavior when running "git fetch" so that the > remote's HEAD symbolic ref is updated at certain point. > > For example after running "git remote add" the remote HEAD is not > set like it is with "git clone". > > Setting "fetch.updatehead = missing" would probably be a sensible > default that everyone would want, but for now the default behavior is to > never update HEAD, so there shouldn't be any functional changes. > > For the next major version of Git, we might want to change this default. > > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> > --- > > This is just a RFC, the tests are missing. I haven't taken much time to re-think through the patch/implications of this, but I remember running into this and going through some pre-patch investigation at some point. It's really annoying in some cases that "clone" isn't creating the same state as "remote". IIRC I was doing some heuristics to figure out the remote branch name etc. Isn't this something we can just change without an option? There were a bunch of cases in clone/fetch that were different for no different reasons, IIRC I patched one or two of those in the past. But I haven't gone through the history of the feature and checked if it was intentional.
On Wed, Nov 18, 2020 at 3:30 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > I haven't taken much time to re-think through the patch/implications of > this, but I remember running into this and going through some pre-patch > investigation at some point. > > It's really annoying in some cases that "clone" isn't creating the same > state as "remote". IIRC I was doing some heuristics to figure out the > remote branch name etc. > > Isn't this something we can just change without an option? There were a > bunch of cases in clone/fetch that were different for no different > reasons, IIRC I patched one or two of those in the past. But I haven't > gone through the history of the feature and checked if it was > intentional. Apparently remote/HEAD is supposed to be set manually, which is why there is "git remote add -m master", and "git remote set-head origin master". Personally I don't see any point in that. I think if no remote/HEAD is set (manually), it should be set automatically on the first "git fetch", and that should mirror the behavior of "git clone". This would be the equivalent of "fetch.updatehead = missing", which in my opinion should be the default. This configuration was suggested by Jeff King, see: https://lore.kernel.org/git/20201118020618.GE650959@coredump.intra.peff.net/ Personally I don't need a configuration, I would be happy if "fetch.updatehead = missing" was the default. Cheers.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > On Wed, Nov 18 2020, Felipe Contreras wrote: > >> Users might change the behavior when running "git fetch" so that the >> remote's HEAD symbolic ref is updated at certain point. >> >> For example after running "git remote add" the remote HEAD is not >> set like it is with "git clone". >> >> Setting "fetch.updatehead = missing" would probably be a sensible >> default that everyone would want, but for now the default behavior is to >> never update HEAD, so there shouldn't be any functional changes. >> >> For the next major version of Git, we might want to change this default. >> >> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> >> --- >> >> This is just a RFC, the tests are missing. > > I haven't taken much time to re-think through the patch/implications of > this, but I remember running into this and going through some pre-patch > investigation at some point. > > It's really annoying in some cases that "clone" isn't creating the same > state as "remote". IIRC I was doing some heuristics to figure out the > remote branch name etc. > > Isn't this something we can just change without an option? There were a > bunch of cases in clone/fetch that were different for no different > reasons, IIRC I patched one or two of those in the past. But I haven't > gone through the history of the feature and checked if it was > intentional. I think what Peff outlined earlier was reasonable. "remote add -f", since it talks with the remote, should be able to learn where their HEAD points at and set it up. "remote add" that does not talk to the remote cannot do so and "fetch" could help but we should not touch existing refs/remotes/$name/HEAD by default [*1*], as the symref is meant to indicate the local choice of which one of their branches is significant to _us_ and what "clone" does is merely to give it the initial value. But when interacting with a remote whose choice of HEAD is always what the local user wants to follow, letting "git fetch" update refs/remotes/$name/HEAD to a newly observed value would be a welcome optional feature. I haven't thought through what Jonathan (Nieder) said about an option to fail a fetch when refs/remotes/$name/HEAD and the remote HEAD "git fetch" observed do not match. It may make sense in some cases but not always, and I do not know what the right explanation for the use case the mode is supposed to be a good match is. [Footnote] *1* This is important when you work in more than one repositories with working tree and use fetch+rebase to keep them in sync.
On Wed, Nov 18, 2020 at 9:53 AM Junio C Hamano <gitster@pobox.com> wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > > I haven't taken much time to re-think through the patch/implications of > > this, but I remember running into this and going through some pre-patch > > investigation at some point. > > > > It's really annoying in some cases that "clone" isn't creating the same > > state as "remote". IIRC I was doing some heuristics to figure out the > > remote branch name etc. > > > > Isn't this something we can just change without an option? There were a > > bunch of cases in clone/fetch that were different for no different > > reasons, IIRC I patched one or two of those in the past. But I haven't > > gone through the history of the feature and checked if it was > > intentional. > > I think what Peff outlined earlier was reasonable. "remote add -f", > since it talks with the remote, should be able to learn where their > HEAD points at and set it up. "remote add" that does not talk to > the remote cannot do so and "fetch" could help but we should not > touch existing refs/remotes/$name/HEAD by default [*1*], as the > symref is meant to indicate the local choice of which one of their > branches is significant to _us_ and what "clone" does is merely to > give it the initial value. The new suggested behavior (fetch.updatehead = missing) is that "git fetch" touches $remote/HEAD *only* when it doesn't exist. So if you set $remote/HEAD manually, you would not affected. However, that behavior is clearly not ideal in the long term, since it seems basically nobody uses that feature. Either way I see no argument against adding this option, and making the default fetch.updatehead = never, which doesn't change the current behavior at all. > But when interacting with a remote whose choice of HEAD is always > what the local user wants to follow, letting "git fetch" update > refs/remotes/$name/HEAD to a newly observed value would be a welcome > optional feature. That would be fetch.updatehead = always. Cheers.
On Wed, Nov 18, 2020 at 03:12:19AM -0600, Felipe Contreras wrote: > Users might change the behavior when running "git fetch" so that the > remote's HEAD symbolic ref is updated at certain point. > > For example after running "git remote add" the remote HEAD is not > set like it is with "git clone". > > Setting "fetch.updatehead = missing" would probably be a sensible > default that everyone would want, but for now the default behavior is to > never update HEAD, so there shouldn't be any functional changes. > > For the next major version of Git, we might want to change this default. Thanks for working on this. Regardless of whether we change the default behavior, this seems like an obvious improvement (and I do think it makes sense to eventually change the default; I'd even be OK switching it to "missing" in the near term). The implementation looks pretty straight-forward, but I have a few comments below: > --- a/Documentation/config/remote.txt > +++ b/Documentation/config/remote.txt > @@ -84,3 +84,6 @@ remote.<name>.promisor:: > remote.<name>.partialclonefilter:: > The filter that will be applied when fetching from this > promisor remote. > + > +remote.<name>.updateHead:: > + Defines when to update the remote HEAD symbolic ref. See `fetch.updateHead`. Nice. I think fetch.updateHead is likely to be the commonly used one, but if this isn't hard to support, it's a nice bonus for flexibility. > +static void update_head(int config, const struct ref *head, const struct remote *remote) > +{ > + struct strbuf ref = STRBUF_INIT, target = STRBUF_INIT; > + const char *r, *head_name = NULL; > + > + if (!head || !head->symref || !remote) > + return; I'd expect us to return early here, as well, if the config is set to "never". I think your function will usually still do the right thing (because we return early before writing), but it makes a pointless lookup of the current origin/HEAD. But see below. > + strbuf_addf(&ref, "refs/remotes/%s/HEAD", remote->name); > + skip_prefix(head->symref, "refs/heads/", &head_name); Should we bail or complain if this skip_prefix() doesn't match? I think it would be a sign of weirdness on the remote side, since HEAD is never supposed to point to anything except refs/heads/. But if we ever did see it, it's probably better to bail than to point to refs/remotes/origin/refs/foo/bar or whatever. > + strbuf_addf(&target, "refs/remotes/%s/%s", remote->name, head_name); > + > + r = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), > + ref.buf, RESOLVE_REF_READING, > + NULL, NULL); This won't resolve a symref pointing to an unborn branch, so it would count as "missing". I.e.: git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/nope git -c fetch.updatehead=missing fetch will update it based on the remote HEAD. I guess I could see some argument for defining "missing" in that way, but I suspect it is not what somebody in this situation would expect. I think it's hard to get into this situation intentionally. If you clone an empty repo, we won't write the symref at all (since we have nothing to write into it), so both sides would be missing. But it's easy enough to do the right thing; see symbolic-ref.c's check_symref() function (basically drop the READING flag). Likewise, I'd not expect to see a non-symref at that name, but what should we do if we see one? I think overwrite it for "always", but not for "missing". You can tell the difference by checking REF_ISSYMREF in the returned flags, though the distinction may not matter if we follow the semantics I just said. > + if (r) { > + if (config == FETCH_UPDATE_HEAD_MISSING) > + /* already present */ > + return; > + else if (config == FETCH_UPDATE_HEAD_ALWAYS && !strcmp(r, target.buf)) > + /* already up-to-date */ > + return; > + else > + /* should never happen */ > + return; > + } > + > + if (create_symref(ref.buf, target.buf, "remote update head")) > + warning(_("could not set remote head")); > +} If we do update the symref, we should probably tell the user. Better still if we can print it as part of the usual fetch output table. > @@ -1382,8 +1420,18 @@ static int do_fetch(struct transport *transport, > break; > } > } > - } else if (transport->remote && transport->remote->fetch.nr) > + } else if (transport->remote && transport->remote->fetch.nr) { > + if (transport->remote->update_head) > + update_head_config = transport->remote->update_head; > + else > + update_head_config = fetch_update_head; > + > + need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER; > + > + if (need_update_head) > + strvec_push(&ref_prefixes, "HEAD"); > refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes); > + } Good catch. We need this for: git fetch origin since otherwise it doesn't ask about HEAD in a v2 exchange. What about: git fetch origin master That won't report on HEAD either, even with your patch, because it hits the part of the conditional before your "else if". What should it do? I can see an argument for "nothing, we only update head on full configured fetches", but if so we should definitely make that clear in the documentation. I can also see an argument for "always, if we happen to have heard about it" (just like we opportunistically update tracking refs even if they are fetched by name into FETCH_HEAD). > diff --git a/remote.h b/remote.h > index 3211abdf05..c16a2d7b2e 100644 > --- a/remote.h > +++ b/remote.h I wondered if we should be interacting with remote.[ch]'s guess_remote_head() here, which is what powers "remote set-head -a", as well as the initial clone decision. But I don't think it make sense. If we were not explicitly given information about a symref, it does not make much sense to guess without seeing all of the refs (which we may well not, due to the v2 ref-prefix capability). In most cases the distinction would not matter anyway. The "guess" part comes in only for ancient pre-symref-capability versions of Git (which we are unlikely to see and it's OK if they just don't make use of the feature), or for remotes with detached HEADs (in which case not setting our local origin/HEAD is probably the best thing, as a bad guess would foul up a later use of the "missing" mode). -Peff
Jeff King <peff@peff.net> writes: > Thanks for working on this. Regardless of whether we change the default > behavior, this seems like an obvious improvement (and I do think it > makes sense to eventually change the default; I'd even be OK switching > it to "missing" in the near term). I agree that "missing" would be an easy thing to take, and I do not mind seeing it made the default in the near term. It won't break existing expectations too much, and can even be seen as a bugfix for the current behaviour by making "init && fetch" a step closer to "clone". Beyond that to modify what the end user already has is a much harder sell. For some it may be an improvement, but for others it would be a breaking change. > The implementation looks pretty straight-forward, but I have a few > comments below: > ... >> + strbuf_addf(&ref, "refs/remotes/%s/HEAD", remote->name); >> + skip_prefix(head->symref, "refs/heads/", &head_name); > > Should we bail or complain if this skip_prefix() doesn't match? I think > it would be a sign of weirdness on the remote side, ... Yes, we should notice the weirdness and stop doing any further harm to the local side. >> + strbuf_addf(&target, "refs/remotes/%s/%s", remote->name, head_name); >> + >> + r = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), >> + ref.buf, RESOLVE_REF_READING, >> + NULL, NULL); > > This won't resolve a symref pointing to an unborn branch, so it would > count as "missing". I.e.: > > git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/nope > git -c fetch.updatehead=missing fetch > > will update it based on the remote HEAD. I guess I could see some > argument for defining "missing" in that way, but I suspect it is not > what somebody in this situation would expect. What do we do in "git clone" of an empty repository with the current branch not yet to be born? Modern Git tells where the HEAD points at even for unborn branch, so using that would be a natural thing to do. > If we do update the symref, we should probably tell the user. Better > still if we can print it as part of the usual fetch output table. True. >> + if (need_update_head) >> + strvec_push(&ref_prefixes, "HEAD"); >> refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes); >> + } > > Good catch. We need this for: > > git fetch origin > > since otherwise it doesn't ask about HEAD in a v2 exchange. What about: > > git fetch origin master > > That won't report on HEAD either, even with your patch, because it hits > the part of the conditional before your "else if". What should it do? I > can see an argument for "nothing, we only update head on full configured > fetches", but if so we should definitely make that clear in the > documentation. I can also see an argument for "always, if we happen to > have heard about it" (just like we opportunistically update tracking > refs even if they are fetched by name into FETCH_HEAD). After seeing that all users got accustomed to seeing an explicit fetch into FETCH_HEAD still update the remote-tracking branches, it is more consistent and easier to understand by users if the "if we happen to have heard about it" is used, I would think. Documenting the behaviour certainly is needed in a real version after this RFC. > In most cases the distinction would not matter anyway. The "guess" part > comes in only for ancient pre-symref-capability versions of Git (which > we are unlikely to see and it's OK if they just don't make use of the > feature), I agree with you that at this point the versions of Git that does not know about advertising symref can be left to rot. > or for remotes with detached HEADs (in which case not setting > our local origin/HEAD is probably the best thing, as a bad guess would > foul up a later use of the "missing" mode). Yes. That would help the "missing" mode. With a mode stronger than "missing" that modifies what the local side has, detaching origin/HEAD in a similar way as the remote has would be OK (although I do not see me using that mode personally). Thanks for a detailed review.
On Fri, Nov 20, 2020 at 04:28:45PM -0800, Junio C Hamano wrote: > > This won't resolve a symref pointing to an unborn branch, so it would > > count as "missing". I.e.: > > > > git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/nope > > git -c fetch.updatehead=missing fetch > > > > will update it based on the remote HEAD. I guess I could see some > > argument for defining "missing" in that way, but I suspect it is not > > what somebody in this situation would expect. > > What do we do in "git clone" of an empty repository with the current > branch not yet to be born? Modern Git tells where the HEAD points at > even for unborn branch, so using that would be a natural thing to do. We don't seem to do anything: $ git init $ git clone . dst Cloning into 'dst'... warning: You appear to have cloned an empty repository. done. $ find dst/.git/refs dst/.git/refs dst/.git/refs/tags dst/.git/refs/heads Likewise with --no-local. I don't think we do advertise the symref in such a case. In v2, the symref information is attached to individual lines in the ref advertisement. And we don't advertise the unborn line (we could, but I think we might need a special syntax for the oid field). In v0, it comes in the capability section attached to the first line of the advertisement, but it doesn't have to be about that particular line. If there are no refs to advertise, we don't seem to send anything (I _thought_ we sent a capabilities^{} line, but I think that is only receive-pack; if we have no refs to fetch, then capabilities are not interesting on the upload-pack side anyway). But even if we do have a ref in v0, it looks like we don't advertise the symref: $ git init $ git commit --allow-empty -m foo $ git checkout --orphan another-branch $ git-upload-pack . 0104d4cebe701d3d7b36e6c383193e92ef4bd49ab2b0 refs/heads/mastermulti_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed object-format=sha1 agent=git/2.29.2.730.g3e418f96ba We could likewise support it there, but I don't think modifying the v0 protocol at this point is that interesting. -Peff
On Fri, Nov 20, 2020 at 6:40 PM Jeff King <peff@peff.net> wrote: > I don't think we do advertise the symref in such a case. In v2, the > symref information is attached to individual lines in the ref > advertisement. And we don't advertise the unborn line (we could, but I > think we might need a special syntax for the oid field). This may be worth considering changing. If a hosting provider (e.g. GitHub) decides to configure an initial branch (e.g. main) it would be nice for "git clone" to have information about that initial branch so the user doesn't have to change it manually to please the provider's aesthetics. So the instructions could be: git clone $url . echo "# myproject" >> README.md git add README.md git commit -m "first commit" # git branch -M main # this step would not be necessary git push -u origin HEAD Personally I don't care about this. But others might find it useful. Cheers.
On Fri, Nov 20, 2020 at 5:52 PM Jeff King <peff@peff.net> wrote: > On Wed, Nov 18, 2020 at 03:12:19AM -0600, Felipe Contreras wrote: > > +static void update_head(int config, const struct ref *head, const struct remote *remote) > > +{ > > + struct strbuf ref = STRBUF_INIT, target = STRBUF_INIT; > > + const char *r, *head_name = NULL; > > + > > + if (!head || !head->symref || !remote) > > + return; > > I'd expect us to return early here, as well, if the config is set to > "never". I think your function will usually still do the right thing > (because we return early before writing), but it makes a pointless > lookup of the current origin/HEAD. But see below. If the config is set to "never", the function update_head is never called, since the boolean need_update_head is never set in the outer function. I don't like the convolutedness of this approach, but couldn't think of anything better. > > + strbuf_addf(&ref, "refs/remotes/%s/HEAD", remote->name); > > + skip_prefix(head->symref, "refs/heads/", &head_name); > > Should we bail or complain if this skip_prefix() doesn't match? I think > it would be a sign of weirdness on the remote side, since HEAD is never > supposed to point to anything except refs/heads/. But if we ever did see > it, it's probably better to bail than to point to > refs/remotes/origin/refs/foo/bar or whatever. OK. An extra check doesn't hurt. > > + strbuf_addf(&target, "refs/remotes/%s/%s", remote->name, head_name); > > + > > + r = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), > > + ref.buf, RESOLVE_REF_READING, > > + NULL, NULL); > > This won't resolve a symref pointing to an unborn branch, so it would > count as "missing". I.e.: > > git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/nope > git -c fetch.updatehead=missing fetch > > will update it based on the remote HEAD. I guess I could see some > argument for defining "missing" in that way, but I suspect it is not > what somebody in this situation would expect. > > I think it's hard to get into this situation intentionally. If you clone > an empty repo, we won't write the symref at all (since we have nothing > to write into it), so both sides would be missing. But it's easy enough > to do the right thing; see symbolic-ref.c's check_symref() function > (basically drop the READING flag). > > Likewise, I'd not expect to see a non-symref at that name, but what > should we do if we see one? I think overwrite it for "always", but not > for "missing". You can tell the difference by checking REF_ISSYMREF in > the returned flags, though the distinction may not matter if we follow > the semantics I just said. All right. So, still checking if REF_ISSYMREF is set, just in case. > > + if (r) { > > + if (config == FETCH_UPDATE_HEAD_MISSING) > > + /* already present */ > > + return; > > + else if (config == FETCH_UPDATE_HEAD_ALWAYS && !strcmp(r, target.buf)) > > + /* already up-to-date */ > > + return; > > + else > > + /* should never happen */ > > + return; > > + } > > + > > + if (create_symref(ref.buf, target.buf, "remote update head")) > > + warning(_("could not set remote head")); > > +} > > If we do update the symref, we should probably tell the user. Better > still if we can print it as part of the usual fetch output table. Agreed. > > @@ -1382,8 +1420,18 @@ static int do_fetch(struct transport *transport, > > break; > > } > > } > > - } else if (transport->remote && transport->remote->fetch.nr) > > + } else if (transport->remote && transport->remote->fetch.nr) { > > + if (transport->remote->update_head) > > + update_head_config = transport->remote->update_head; > > + else > > + update_head_config = fetch_update_head; > > + > > + need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER; > > + > > + if (need_update_head) > > + strvec_push(&ref_prefixes, "HEAD"); > > refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes); > > + } > > Good catch. We need this for: > > git fetch origin > > since otherwise it doesn't ask about HEAD in a v2 exchange. What about: > > git fetch origin master > > That won't report on HEAD either, even with your patch, because it hits > the part of the conditional before your "else if". What should it do? I > can see an argument for "nothing, we only update head on full configured > fetches", but if so we should definitely make that clear in the > documentation. I can also see an argument for "always, if we happen to > have heard about it" (just like we opportunistically update tracking > refs even if they are fetched by name into FETCH_HEAD). I don't see the point in complicating the code for those corner-cases. And I also don't see how HEAD can be fetched unless we specifically ask for it. What would that command look like? Also, there's two optimizations that apparently went unnoticed: 1. In the case of "missing". We could preemptively check if there's already an "origin/HEAD" before adding "HEAD" to the prefixes (and setting need_update_head). That would probably complicate the code. 2. Also in the case of "missing". There's no point in building the "target" string buffer, since we are never going to compare it to anything. Again, this would complicate the code. I decided against these in v1 because as I already stated: it complicates the code. Plus, I forgot to release the string buffers (I'm glad nobody noticed). That will be in v2. Cheers. Cheers.
On Fri, Nov 20, 2020 at 6:28 PM Junio C Hamano <gitster@pobox.com> wrote: > > Jeff King <peff@peff.net> writes: > > > Thanks for working on this. Regardless of whether we change the default > > behavior, this seems like an obvious improvement (and I do think it > > makes sense to eventually change the default; I'd even be OK switching > > it to "missing" in the near term). > > I agree that "missing" would be an easy thing to take, and I do not > mind seeing it made the default in the near term. It won't break > existing expectations too much, and can even be seen as a bugfix for > the current behaviour by making "init && fetch" a step closer to > "clone". Beyond that to modify what the end user already has is a > much harder sell. For some it may be an improvement, but for others > it would be a breaking change. Changing the default to "missing" breaks a lot of tests. I already have the fixes for the tests, but this can be seen as an indication that the expectations of users would change. I never knew of this $remote/HEAD feature, and searching forums some people seem perplexed by its existence and ask how to remove it. So, if the "missing" behavior is the one we are targeting (which I argue we should), we probably need a warning before doing the flip, so that users become aware of the feature and are not surprised by a sudden $remote/HEAD popping (or repopping) seemingly out of nowhere. Cheers.
On Fri, Nov 20, 2020 at 07:18:39PM -0600, Felipe Contreras wrote: > On Fri, Nov 20, 2020 at 6:40 PM Jeff King <peff@peff.net> wrote: > > > I don't think we do advertise the symref in such a case. In v2, the > > symref information is attached to individual lines in the ref > > advertisement. And we don't advertise the unborn line (we could, but I > > think we might need a special syntax for the oid field). > > This may be worth considering changing. I agree it might be nice, but may not be worth the effort (it will require a protocol extension). However... > If a hosting provider (e.g. GitHub) decides to configure an initial > branch (e.g. main) it would be nice for "git clone" to have > information about that initial branch so the user doesn't have to > change it manually to please the provider's aesthetics. > > So the instructions could be: > > git clone $url . > echo "# myproject" >> README.md > git add README.md > git commit -m "first commit" > # git branch -M main # this step would not be necessary > git push -u origin HEAD > > Personally I don't care about this. But others might find it useful. I don't think you have to do the rename here if you don't want to. If there is already content on the "main" branch at GitHub, then clone should pick that as the default branch for your local clone. If there isn't, then you'll get whatever your local git-clone decides on. But when you push, at least to GitHub, if the server-side HEAD points to an unborn branch, it will re-point to the newly pushed branch. So with the current version of Git, you would end up with "master" on the remote side. And that should continue to work as it always has (AFAIK any changes on GitHub's side are purely about default branch names, such as when you add template files as part of the repo initialization). Of course if you _want_ "main" in both cases, you'd have to rename the branch. But you would probably set init.defaultBranch in that case. -Peff
On Fri, Nov 20, 2020 at 07:41:48PM -0600, Felipe Contreras wrote: > > > +static void update_head(int config, const struct ref *head, const struct remote *remote) > > > +{ > > > + struct strbuf ref = STRBUF_INIT, target = STRBUF_INIT; > > > + const char *r, *head_name = NULL; > > > + > > > + if (!head || !head->symref || !remote) > > > + return; > > > > I'd expect us to return early here, as well, if the config is set to > > "never". I think your function will usually still do the right thing > > (because we return early before writing), but it makes a pointless > > lookup of the current origin/HEAD. But see below. > > If the config is set to "never", the function update_head is never > called, since the boolean need_update_head is never set in the outer > function. True. > I don't like the convolutedness of this approach, but couldn't think > of anything better. In general, I think keeping as much logic as possible in update_head() makes sense, rather than the caller. We don't need to avoid running update_head() when the config is "never" if it is a true noop in that case. Unfortunately, we do still need to make a decision on whether to send a request for "HEAD" to the other side in some cases, so some logic has to be there. But if we don't send such a request, but we _do_ find out about HEAD somehow (e.g., because the user said "git fetch origin HEAD", which seems like an obvious thing for someone to try in this case), we probably should trigger the function. > > since otherwise it doesn't ask about HEAD in a v2 exchange. What about: > > > > git fetch origin master > > > > That won't report on HEAD either, even with your patch, because it hits > > the part of the conditional before your "else if". What should it do? I > > can see an argument for "nothing, we only update head on full configured > > fetches", but if so we should definitely make that clear in the > > documentation. I can also see an argument for "always, if we happen to > > have heard about it" (just like we opportunistically update tracking > > refs even if they are fetched by name into FETCH_HEAD). > > I don't see the point in complicating the code for those corner-cases. I would think: git pull origin master isn't that much of a corner case, and people would expect a HEAD update to trigger (especially if that HEAD is master). That will run "git fetch origin master" under the hood (and update both FETCH_HEAD and refs/remotes/origin/master). > And I also don't see how HEAD can be fetched unless we specifically > ask for it. What would that command look like? I think it would be nice if any of: - git fetch - git fetch origin - git fetch origin master - git fetch origin HEAD triggered the auto-update, along with their git-pull equivalents. In the first three, v2 protocol will require mentioning HEAD as a ref-prefix. > Also, there's two optimizations that apparently went unnoticed: > > 1. In the case of "missing". We could preemptively check if there's > already an "origin/HEAD" before adding "HEAD" to the prefixes (and > setting need_update_head). That would probably complicate the code. Hmm. That might be worth doing, as it does involve extra network traffic (and the server-side lookup is more complicated, as it looks up all possible variants of HEAD). But it is only a few bytes. I guess it would involve saving the value of origin/HEAD between the two calls, but that doesn't seem all that hard. But I'd also be OK with just unconditionally asking for HEAD (but as above, I think it should happen even with a refspec, and anytime we hear about HEAD we should consider updating the symref). -Peff
diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt index 6af6f5edb2..93d6c59fac 100644 --- a/Documentation/config/fetch.txt +++ b/Documentation/config/fetch.txt @@ -94,3 +94,7 @@ fetch.writeCommitGraph:: merge and the write may take longer. Having an updated commit-graph file helps performance of many Git commands, including `git merge-base`, `git push -f`, and `git log --graph`. Defaults to false. + +fetch.updateHead:: + Defines when to update the remote HEAD symbolic ref. Values are 'never', + 'missing' (update only when HEAD is missing), and 'always'. diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt index a8e6437a90..905661c7f7 100644 --- a/Documentation/config/remote.txt +++ b/Documentation/config/remote.txt @@ -84,3 +84,6 @@ remote.<name>.promisor:: remote.<name>.partialclonefilter:: The filter that will be applied when fetching from this promisor remote. + +remote.<name>.updateHead:: + Defines when to update the remote HEAD symbolic ref. See `fetch.updateHead`. diff --git a/builtin/fetch.c b/builtin/fetch.c index f9c3c49f14..b47b06f001 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -55,6 +55,8 @@ static int fetch_prune_tags_config = -1; /* unspecified */ static int prune_tags = -1; /* unspecified */ #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */ +static int fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT; + static int all, append, dry_run, force, keep, multiple, update_head_ok; static int write_fetch_head = 1; static int verbosity, deepen_relative, set_upstream; @@ -120,6 +122,9 @@ static int git_fetch_config(const char *k, const char *v, void *cb) return 0; } + if (!strcmp(k, "fetch.updatehead")) + return parse_update_head(&fetch_update_head, k, v); + return git_default_config(k, v, cb); } @@ -1342,6 +1347,38 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map) } } +static void update_head(int config, const struct ref *head, const struct remote *remote) +{ + struct strbuf ref = STRBUF_INIT, target = STRBUF_INIT; + const char *r, *head_name = NULL; + + if (!head || !head->symref || !remote) + return; + + strbuf_addf(&ref, "refs/remotes/%s/HEAD", remote->name); + skip_prefix(head->symref, "refs/heads/", &head_name); + strbuf_addf(&target, "refs/remotes/%s/%s", remote->name, head_name); + + r = refs_resolve_ref_unsafe(get_main_ref_store(the_repository), + ref.buf, RESOLVE_REF_READING, + NULL, NULL); + + if (r) { + if (config == FETCH_UPDATE_HEAD_MISSING) + /* already present */ + return; + else if (config == FETCH_UPDATE_HEAD_ALWAYS && !strcmp(r, target.buf)) + /* already up-to-date */ + return; + else + /* should never happen */ + return; + } + + if (create_symref(ref.buf, target.buf, "remote update head")) + warning(_("could not set remote head")); +} + static int do_fetch(struct transport *transport, struct refspec *rs) { @@ -1351,6 +1388,7 @@ static int do_fetch(struct transport *transport, const struct ref *remote_refs; struct strvec ref_prefixes = STRVEC_INIT; int must_list_refs = 1; + int need_update_head = 0, update_head_config = 0; if (tags == TAGS_DEFAULT) { if (transport->remote->fetch_tags == 2) @@ -1382,8 +1420,18 @@ static int do_fetch(struct transport *transport, break; } } - } else if (transport->remote && transport->remote->fetch.nr) + } else if (transport->remote && transport->remote->fetch.nr) { + if (transport->remote->update_head) + update_head_config = transport->remote->update_head; + else + update_head_config = fetch_update_head; + + need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER; + + if (need_update_head) + strvec_push(&ref_prefixes, "HEAD"); refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes); + } if (tags == TAGS_SET || tags == TAGS_DEFAULT) { must_list_refs = 1; @@ -1427,6 +1475,9 @@ static int do_fetch(struct transport *transport, goto cleanup; } + if (need_update_head) + update_head(update_head_config, find_ref_by_name(remote_refs, "HEAD"), transport->remote); + if (set_upstream) { struct branch *branch = branch_get("HEAD"); struct ref *rm; diff --git a/remote.c b/remote.c index 8a6dbbb903..dad8062561 100644 --- a/remote.c +++ b/remote.c @@ -298,6 +298,25 @@ static void read_branches_file(struct remote *remote) remote->fetch_tags = 1; /* always auto-follow */ } +int parse_update_head(int *r, const char *var, const char *value) +{ + if (!r) + return -1; + else if (!value) + return config_error_nonbool(var); + else if (!strcmp(value, "never")) + *r = FETCH_UPDATE_HEAD_NEVER; + else if (!strcmp(value, "missing")) + *r = FETCH_UPDATE_HEAD_MISSING; + else if (!strcmp(value, "always")) + *r = FETCH_UPDATE_HEAD_ALWAYS; + else { + error(_("malformed value for %s: %s"), var, value); + return error(_("must be one of never, missing, or always")); + } + return 0; +} + static int handle_config(const char *key, const char *value, void *cb) { const char *name; @@ -418,6 +437,8 @@ static int handle_config(const char *key, const char *value, void *cb) key, value); } else if (!strcmp(subkey, "vcs")) { return git_config_string(&remote->foreign_vcs, key, value); + } else if (!strcmp(subkey, "updatehead")) { + return parse_update_head(&remote->update_head, key, value); } return 0; } diff --git a/remote.h b/remote.h index 3211abdf05..c16a2d7b2e 100644 --- a/remote.h +++ b/remote.h @@ -21,6 +21,13 @@ enum { REMOTE_BRANCHES }; +enum { + FETCH_UPDATE_HEAD_DEFAULT = 0, + FETCH_UPDATE_HEAD_NEVER, + FETCH_UPDATE_HEAD_MISSING, + FETCH_UPDATE_HEAD_ALWAYS, +}; + struct remote { struct hashmap_entry ent; @@ -62,6 +69,8 @@ struct remote { int prune; int prune_tags; + int update_head; + /** * The configured helper programs to run on the remote side, for * Git-native protocols. @@ -372,4 +381,6 @@ int parseopt_push_cas_option(const struct option *, const char *arg, int unset); int is_empty_cas(const struct push_cas_option *); void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *); +int parse_update_head(int *r, const char *var, const char *value); + #endif
Users might change the behavior when running "git fetch" so that the remote's HEAD symbolic ref is updated at certain point. For example after running "git remote add" the remote HEAD is not set like it is with "git clone". Setting "fetch.updatehead = missing" would probably be a sensible default that everyone would want, but for now the default behavior is to never update HEAD, so there shouldn't be any functional changes. For the next major version of Git, we might want to change this default. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> --- This is just a RFC, the tests are missing. Documentation/config/fetch.txt | 4 +++ Documentation/config/remote.txt | 3 ++ builtin/fetch.c | 53 ++++++++++++++++++++++++++++++++- remote.c | 21 +++++++++++++ remote.h | 11 +++++++ 5 files changed, 91 insertions(+), 1 deletion(-)