mbox series

[RFC,v2,0/2] set remote/HEAD with fetch

Message ID 20240910203835.2288291-1-bence@ferdinandy.com (mailing list archive)
Headers show
Series set remote/HEAD with fetch | expand

Message

Bence Ferdinandy Sept. 10, 2024, 8:37 p.m. UTC
Sorry for the noise, I managed to badly mess up receipients in the previous
thread so v2 is just fixing that ...

Hi,

these two patches attempt to solve the issue raised in
https://lore.kernel.org/git/D3HBD7C1FR14.74FL1Q1S9UCB@ferdinandy.com/

As a quick summary: `clone` sets `refs/remotes/[remote]/HEAD` while going init
-> remote add -> fetch does not, one has to manually run `remote set-head -a
[remote]`.

The first patch adds a `--set-head` flag to `fetch` and `remote update` which
runs `remote set-head -a` for us. Unfortunately, with the current behaviour of
set-head this will always print a message, even though a no-op fetch doesn't
print anything, and I think this is also confusing for people who do not care
about remote/HEAD, so the second patch removes the print if `set-head -a` is
a no-op (and actually makes it into a no-op, instead of just idempotent).

Another way could of course be duplicating some of the code from remote
set-head in fetch.c instead of calling directly, but it didn't look like an
anti-pattern in the code-base and it felt the best way to insure identical
behaviour between a `git fetch --all --set-head` and a 
`git fetch --all && git remote | xargs -i git remote set-head -a {}`.

What is missing for sure is:
- documentation
- tests (if needed)
- settings

For settings, my idea would be a fetch/remote.set_head that could take three values:
    * never
    * missing: run it only if the ref is missing, this setting would basically
      allow replicating the result of a clone
    * always (with the other patch, this would still be a no-op if it didn't change)

This would probably also require a --no-set-head flag, to disable an
always/missing setting. A --missing-set-head or something of the like also may
or may not make sense. Alternatively, only two behaviours might be enough
(missing and always) since clone already sort of does this.

I'm not sure if the general approach is fine or not, nor am I sure the code
itself is any good, but it "works on my computer" :) I'm also hoping that
I managed to read all the relevant parts for sending a patch.

Feedback would be highly appreciated!

Thanks,
Bence


Bence Ferdinandy (2):
  fetch: set-head with --set-head option
  set-head: do not update if there is no change

 builtin/fetch.c  | 29 ++++++++++++++++++++++++-----
 builtin/remote.c | 15 +++++++++++----
 2 files changed, 35 insertions(+), 9 deletions(-)

Comments

Junio C Hamano Sept. 10, 2024, 10:29 p.m. UTC | #1
Bence Ferdinandy <bence@ferdinandy.com> writes:

> What is missing for sure is:
> - documentation
> - tests (if needed)

What change does not need tests?

> - settings
>
> For settings, my idea would be a fetch/remote.set_head that could take three values:
>     * never
>     * missing: run it only if the ref is missing, this setting would basically
>       allow replicating the result of a clone
>     * always (with the other patch, this would still be a no-op if it didn't change)
>
> This would probably also require a --no-set-head flag, to disable an
> always/missing setting. A --missing-set-head or something of the like also may
> or may not make sense. Alternatively, only two behaviours might be enough
> (missing and always) since clone already sort of does this.

If we were to assume "always" is needed, then the tristate like the
above may be a reasonable way to go.

But as I outlined in my response to [1/2], I suspect that an
approach without configuration or command line option would give the
users the most smooth experience.  They are used to seeing "clone"
to give them a remote tracking HEAD and nobody complained that we
lack the option to "clone" to prevent that.  If "fetch" notices we
do not have remote tracking HEAD for a remote [*] and stores what it
observed at their HEAD in remote tracking HEAD, that should not
bother anybody.  No matter what mechanism gave you the initial
remote tracking HEAD, if you want to update it to something else, we
already have the "remote set-head" command.


[Footnote]

 * One thing we MUST be careful about is that some remotes may not
   have ANY remote tracking branches (i.e. you only want to use the
   remote mechanism to give you a shorthand for URL, but you do not
   have fetch refspec at all).  Even if refs/remotes/$repo/HEAD is
   missing for such a remote, we should *not* attempt to create it,
   as we are not populating refs/remotes/$repo/master and friends at
   all in such a case.
Bence Ferdinandy Sept. 11, 2024, 12:24 p.m. UTC | #2
On Wed Sep 11, 2024 at 00:29, Junio C Hamano <gitster@pobox.com> wrote:
> Bence Ferdinandy <bence@ferdinandy.com> writes:
>
> > What is missing for sure is:
> > - documentation
> > - tests (if needed)
>
> What change does not need tests?

Fair enough, for the next iteration I'll look into tests as well!

>
> > - settings
> >
> > For settings, my idea would be a fetch/remote.set_head that could take three values:
> >     * never
> >     * missing: run it only if the ref is missing, this setting would basically
> >       allow replicating the result of a clone
> >     * always (with the other patch, this would still be a no-op if it didn't change)
> >
> > This would probably also require a --no-set-head flag, to disable an
> > always/missing setting. A --missing-set-head or something of the like also may
> > or may not make sense. Alternatively, only two behaviours might be enough
> > (missing and always) since clone already sort of does this.
>
> If we were to assume "always" is needed, then the tristate like the
> above may be a reasonable way to go.
>
> But as I outlined in my response to [1/2], I suspect that an
> approach without configuration or command line option would give the
> users the most smooth experience.  They are used to seeing "clone"
> to give them a remote tracking HEAD and nobody complained that we
> lack the option to "clone" to prevent that.  If "fetch" notices we
> do not have remote tracking HEAD for a remote [*] and stores what it
> observed at their HEAD in remote tracking HEAD, that should not
> bother anybody.  No matter what mechanism gave you the initial
> remote tracking HEAD, if you want to update it to something else, we
> already have the "remote set-head" command.

So I guess we did conclude that no settings are actually needed.

>
>
> [Footnote]
>
>  * One thing we MUST be careful about is that some remotes may not
>    have ANY remote tracking branches (i.e. you only want to use the
>    remote mechanism to give you a shorthand for URL, but you do not
>    have fetch refspec at all).  Even if refs/remotes/$repo/HEAD is
>    missing for such a remote, we should *not* attempt to create it,
>    as we are not populating refs/remotes/$repo/master and friends at
>    all in such a case.

You mean that somebody does git init && git remote add origin $remote, but
never does calls fetch? Currently remote set-head -a origin will fail with

error: Not a valid ref: refs/remotes/origin/master

Although if everything is done _after_ a fetch, unless I misunderstood, we
shouldn't be able to be in this situation at all.

Best,
Bence
Junio C Hamano Sept. 11, 2024, 3:59 p.m. UTC | #3
"Bence Ferdinandy" <bence@ferdinandy.com> writes:

> On Wed Sep 11, 2024 at 00:29, Junio C Hamano <gitster@pobox.com> wrote:
>> Bence Ferdinandy <bence@ferdinandy.com> writes:
>>
>> > What is missing for sure is:
>> > - documentation
>> > - tests (if needed)
>>
>> What change does not need tests?
>
> Fair enough, for the next iteration I'll look into tests as well!

In this project, tests are written not because we want to
demonstrate the shiny new feature we just invented.  We write tests
so that the invention we are shipping to our end users will keep
behaveing in the way we originall intended without getting broken by
future careless developers.  The tests serve as watching eyes while
the original author of the feature is not watching ;-) 

And that is why we also write tests that a feature does not trigger
when it should not, as well as it triggers when it should.

>>  * One thing we MUST be careful about is that some remotes may not
>>    have ANY remote tracking branches (i.e. you only want to use the
>>    remote mechanism to give you a shorthand for URL, but you do not
>>    have fetch refspec at all).  Even if refs/remotes/$repo/HEAD is
>>    missing for such a remote, we should *not* attempt to create it,
>>    as we are not populating refs/remotes/$repo/master and friends at
>>    all in such a case.
>
> You mean that somebody does git init && git remote add origin $remote, but
> never does calls fetch?

No.  If the remote HEAD does not exist, we may still not want to
create it.  Imagine

	[remote "his"]
		url = https://git.kernel.org/pub/scm/git/git
		push = refs/heads/maint
		push = refs/heads/master
		push = refs/heads/next
		push = +refs/heads/seen

without any refspec for the fetching side.  "git fetch his master"
may learn where the remote HEAD is, and it may even be pointing at
their 'master' branch, but because we do not maintain any remote
tracking information for their 'master' (in other words,
refs/remotes/his/master is not updated by this 'fetch' and there is
no configuration to make future 'fetch' to do so).

Thanks.