[v2] fetch: document and test --refmap=""
diff mbox series

Message ID pull.532.v2.git.1579570692766.gitgitgadget@gmail.com
State New
Headers show
Series
  • [v2] fetch: document and test --refmap=""
Related show

Commit Message

Ralf Thielow via GitGitGadget Jan. 21, 2020, 1:38 a.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

To prevent long blocking time during a 'git fetch' call, a user
may want to set up a schedule for background 'git fetch' processes.
However, these runs will update the refs/remotes branches due to
the default refspec set in the config when Git adds a remote.
Hence the user will not notice when remote refs are updated during
their foreground fetches. In fact, they may _want_ those refs to
stay put so they can work with the refs from their last foreground
fetch call.

This can be accomplished by overriding the configured refspec using
'--refmap=' along with a custom refspec:

  git fetch <remote> --refmap= +refs/heads/*:refs/hidden/<remote>/*

to populate a custom ref space and download a pack of the new
reachable objects. This kind of call allows a few things to happen:

1. We download a new pack if refs have updated.
2. Since the refs/hidden branches exist, GC will not remove the
   newly-downloaded data.
3. With fetch.writeCommitGraph enabled, the refs/hidden refs are
   used to update the commit-graph file.

To avoid the refs/hidden directory from filling without bound, the
--prune option can be included. When providing a refspec like this,
the --prune option does not delete remote refs and instead only
deletes refs in the target refspace.

Update the documentation to clarify how '--refmap=""' works and
create tests to guarantee this behavior remains in the future.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
    fetch: document and test '--refmap=""'
    
    Thanks for all the feedback leading to absolutely no code change. It's
    good we already have the flexibility for this. I'm a bit embarrassed
    that I did not discover this, so perhaps this doc change and new tests
    will help clarify the behavior.
    
    Thanks, -Stolee

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-532%2Fderrickstolee%2Ffetch-no-update-remote-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-532/derrickstolee/fetch-no-update-remote-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/532

Range-diff vs v1:

 1:  26f13ea5f5 ! 1:  51c6221a31 fetch: add --no-update-remote-refs
     @@ -1,20 +1,20 @@
      Author: Derrick Stolee <dstolee@microsoft.com>
      
     -    fetch: add --no-update-remote-refs
     +    fetch: document and test --refmap=""
      
          To prevent long blocking time during a 'git fetch' call, a user
          may want to set up a schedule for background 'git fetch' processes.
     -    However, these runs will update the refs/remotes branches, and
     -    hence the user will not notice when remote refs are updated during
     +    However, these runs will update the refs/remotes branches due to
     +    the default refspec set in the config when Git adds a remote.
     +    Hence the user will not notice when remote refs are updated during
          their foreground fetches. In fact, they may _want_ those refs to
          stay put so they can work with the refs from their last foreground
          fetch call.
      
     -    Add a --[no-]update-remote-refs option to 'git fetch' which defaults
     -    to the existing behavior of updating the remote refs. This allows
     -    a user to run
     +    This can be accomplished by overriding the configured refspec using
     +    '--refmap=' along with a custom refspec:
      
     -      git fetch <remote> --no-update-remote-refs +refs/heads/*:refs/hidden/*
     +      git fetch <remote> --refmap= +refs/heads/*:refs/hidden/<remote>/*
      
          to populate a custom ref space and download a pack of the new
          reachable objects. This kind of call allows a few things to happen:
     @@ -30,9 +30,8 @@
          the --prune option does not delete remote refs and instead only
          deletes refs in the target refspace.
      
     -    Note: with the default refpsec, the --prune option will override
     -    the --no-update-remote-refs option and will delete the refs that
     -    do not exist on the remote.
     +    Update the documentation to clarify how '--refmap=""' works and
     +    create tests to guarantee this behavior remains in the future.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
     @@ -40,50 +39,17 @@
       --- a/Documentation/fetch-options.txt
       +++ b/Documentation/fetch-options.txt
      @@
     - 	'git-pull' the --ff-only option will still check for forced updates
     - 	before attempting a fast-forward update. See linkgit:git-config[1].
     + 	specified refspec (can be given more than once) to map the
     + 	refs to remote-tracking branches, instead of the values of
     + 	`remote.*.fetch` configuration variables for the remote
     +-	repository.  See section on "Configured Remote-tracking
     ++	repository.  Providing an empty `<refspec>` to the
     ++	`--refmap` option causes Git to ignore the configured
     ++	refspecs and rely entirely on the refspecs supplied as
     ++	command-line arguments. See section on "Configured Remote-tracking
     + 	Branches" for details.
       
     -+--no-update-remote-refs::
     -+	By default, git updates the `refs/remotes/` refspace with the refs
     -+	advertised by the remotes during a `git fetch` command. With this
     -+	option, those refs will be ignored. If the `--prune` option is
     -+	specified and the default refpsec is used, then a ref that does not
     -+	appear in the remote will still be deleted from refs/remotes.
     -+
     - -4::
     - --ipv4::
     - 	Use IPv4 addresses only, ignoring IPv6 addresses.
     -
     - diff --git a/builtin/fetch.c b/builtin/fetch.c
     - --- a/builtin/fetch.c
     - +++ b/builtin/fetch.c
     -@@
     - static struct string_list server_options = STRING_LIST_INIT_DUP;
     - static struct string_list negotiation_tip = STRING_LIST_INIT_NODUP;
     - static int fetch_write_commit_graph = -1;
     -+static int update_remote_refs = 1;
     - 
     - static int git_fetch_config(const char *k, const char *v, void *cb)
     - {
     -@@
     - 		 N_("check for forced-updates on all updated branches")),
     - 	OPT_BOOL(0, "write-commit-graph", &fetch_write_commit_graph,
     - 		 N_("write the commit-graph after fetching")),
     -+	OPT_BOOL(0, "update-remote-refs", &update_remote_refs,
     -+		 N_("update the refs/remotes/ refspace")),
     - 	OPT_END()
     - };
     - 
     -@@
     - 	const char *pretty_ref = prettify_refname(ref->name);
     - 	int fast_forward = 0;
     - 
     -+	if (!update_remote_refs && starts_with(ref->name, "refs/remotes/"))
     -+		return 0;
     -+
     - 	type = oid_object_info(the_repository, &ref->new_oid, NULL);
     - 	if (type < 0)
     - 		die(_("object %s not found"), oid_to_hex(&ref->new_oid));
     + -t::
      
       diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
       --- a/t/t5510-fetch.sh
     @@ -92,13 +58,13 @@
       	git rev-parse sometag
       '
       
     -+test_expect_success 'fetch --no-update-remote-refs keeps existing refs' '
     ++test_expect_success '--refmap="" ignores configured refspec' '
      +	cd "$TRASH_DIRECTORY" &&
      +	git clone "$D" remote-refs &&
      +	git -C remote-refs rev-parse remotes/origin/master >old &&
      +	git -C remote-refs update-ref refs/remotes/origin/master master~1 &&
      +	git -C remote-refs rev-parse remotes/origin/master >new &&
     -+	git -C remote-refs fetch --no-update-remote-refs origin &&
     ++	git -C remote-refs fetch --refmap= origin "+refs/heads/*:refs/hidden/origin/*" &&
      +	git -C remote-refs rev-parse remotes/origin/master >actual &&
      +	test_cmp new actual &&
      +	git -C remote-refs fetch origin &&
     @@ -106,13 +72,13 @@
      +	test_cmp old actual
      +'
      +
     -+test_expect_success 'fetch --no-update-remote-refs --prune with refspec' '
     ++test_expect_success '--refmap="" and --prune' '
      +	git -C remote-refs update-ref refs/remotes/origin/foo/otherbranch master &&
      +	git -C remote-refs update-ref refs/hidden/foo/otherbranch master &&
     -+	git -C remote-refs fetch --prune --no-update-remote-refs origin +refs/heads/*:refs/hidden/* &&
     ++	git -C remote-refs fetch --prune --refmap="" origin +refs/heads/*:refs/hidden/* &&
      +	git -C remote-refs rev-parse remotes/origin/foo/otherbranch &&
      +	test_must_fail git -C remote-refs rev-parse refs/hidden/foo/otherbranch &&
     -+	git -C remote-refs fetch --prune --no-update-remote-refs origin &&
     ++	git -C remote-refs fetch --prune origin &&
      +	test_must_fail git -C remote-refs rev-parse remotes/origin/foo/otherbranch
      +'
      +


 Documentation/fetch-options.txt |  5 ++++-
 t/t5510-fetch.sh                | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)


base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783

Comments

Jeff King Jan. 21, 2020, 4:24 p.m. UTC | #1
On Tue, Jan 21, 2020 at 01:38:12AM +0000, Derrick Stolee via GitGitGadget wrote:

> Update the documentation to clarify how '--refmap=""' works and
> create tests to guarantee this behavior remains in the future.

Yeah, this looks like a good solution to me.

> This can be accomplished by overriding the configured refspec using
> '--refmap=' along with a custom refspec:
> 
>   git fetch <remote> --refmap= +refs/heads/*:refs/hidden/<remote>/*

This isn't strictly related to your patch, but since the rationale here
describes the concept of a background job and people might end up using
it as a reference, do you want to add in --no-tags to help them out?

>     Thanks for all the feedback leading to absolutely no code change. It's
>     good we already have the flexibility for this. I'm a bit embarrassed
>     that I did not discover this, so perhaps this doc change and new tests
>     will help clarify the behavior.

If it makes you feel better, I only found --refmap because I was the one
who implemented the original "update based on refspecs" logic, and while
looking for that commit with "git log --grep=opportunistic" I stumbled
onto Junio's commit adding --refmap, which referenced mine. Maybe this
also works as a good case study in why we should write good commit
messages and refer to related work. :)

Anyway, I wasn't at all sure that a blank --refmap= would do what you
want until I tried it. But it was always intended to work that way. From
c5558f80c3 (fetch: allow explicit --refmap to override configuration,
2014-05-29):

  +static int parse_refmap_arg(const struct option *opt, const char *arg, int unset)
  +{
  +       ALLOC_GROW(refmap_array, refmap_nr + 1, refmap_alloc);
  +
  +       /*
  +        * "git fetch --refmap='' origin foo"
  +        * can be used to tell the command not to store anywhere
  +        */
  +       if (*arg)
  +               refmap_array[refmap_nr++] = arg;
  +       return 0;
  +}

At first I thought the comment was wrong, since we don't actually
increment refmap_nr. But the ALLOC_GROW makes refmap_array non-NULL,
which is what triggers the "do not use configured refspecs" logic.

This code switched to refspec_append() in e4cffacc80 (fetch: convert
refmap to use struct refspec, 2018-05-16), and I think we actually do
push an empty string onto the list. Which then triggers the "do not use
configured refspecs" logic, but doesn't match anything itself. I'm not
sure whether that behavior was entirely planned, or just what the code
happens to do. So it's doubly useful to add a test here covering the
expected behavior.

>  Documentation/fetch-options.txt |  5 ++++-
>  t/t5510-fetch.sh                | 24 ++++++++++++++++++++++++
>  2 files changed, 28 insertions(+), 1 deletion(-)

The patch looks good to me.

-Peff
Derrick Stolee Jan. 21, 2020, 6:01 p.m. UTC | #2
On 1/21/2020 11:24 AM, Jeff King wrote:
> On Tue, Jan 21, 2020 at 01:38:12AM +0000, Derrick Stolee via GitGitGadget wrote:
> 
>> Update the documentation to clarify how '--refmap=""' works and
>> create tests to guarantee this behavior remains in the future.
> 
> Yeah, this looks like a good solution to me.
> 
>> This can be accomplished by overriding the configured refspec using
>> '--refmap=' along with a custom refspec:
>>
>>   git fetch <remote> --refmap= +refs/heads/*:refs/hidden/<remote>/*
> 
> This isn't strictly related to your patch, but since the rationale here
> describes the concept of a background job and people might end up using
> it as a reference, do you want to add in --no-tags to help them out?

That's a good idea. I keep forgetting about that. It's interesting that
tags are fetched even though my refpsec does not include refs/tags.

>>     Thanks for all the feedback leading to absolutely no code change. It's
>>     good we already have the flexibility for this. I'm a bit embarrassed
>>     that I did not discover this, so perhaps this doc change and new tests
>>     will help clarify the behavior.

> Anyway, I wasn't at all sure that a blank --refmap= would do what you
> want until I tried it. But it was always intended to work that way. From
> c5558f80c3 (fetch: allow explicit --refmap to override configuration,
> 2014-05-29):
> 
>   +static int parse_refmap_arg(const struct option *opt, const char *arg, int unset)
>   +{
>   +       ALLOC_GROW(refmap_array, refmap_nr + 1, refmap_alloc);
>   +
>   +       /*
>   +        * "git fetch --refmap='' origin foo"
>   +        * can be used to tell the command not to store anywhere
>   +        */
>   +       if (*arg)
>   +               refmap_array[refmap_nr++] = arg;
>   +       return 0;
>   +}
> 
> At first I thought the comment was wrong, since we don't actually
> increment refmap_nr. But the ALLOC_GROW makes refmap_array non-NULL,
> which is what triggers the "do not use configured refspecs" logic.

This works due to a subtle arrangement of things, like a non-NULL
but "empty" array. That justifies the test even more.
 
> This code switched to refspec_append() in e4cffacc80 (fetch: convert
> refmap to use struct refspec, 2018-05-16), and I think we actually do
> push an empty string onto the list. Which then triggers the "do not use
> configured refspecs" logic, but doesn't match anything itself. I'm not
> sure whether that behavior was entirely planned, or just what the code
> happens to do. So it's doubly useful to add a test here covering the
> expected behavior.

Excellent. Glad I'm not just adding test bloat for now reason.

-Stolee
Junio C Hamano Jan. 21, 2020, 6:22 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> On Tue, Jan 21, 2020 at 01:38:12AM +0000, Derrick Stolee via GitGitGadget wrote:
>
>> Update the documentation to clarify how '--refmap=""' works and
>> create tests to guarantee this behavior remains in the future.
>
> Yeah, this looks like a good solution to me.
>
>> This can be accomplished by overriding the configured refspec using
>> '--refmap=' along with a custom refspec:
>> 
>>   git fetch <remote> --refmap= +refs/heads/*:refs/hidden/<remote>/*
>
> This isn't strictly related to your patch, but since the rationale here
> describes the concept of a background job and people might end up using
> it as a reference, do you want to add in --no-tags to help them out?

Also, is the order of the arguments correct?  I would have expected
to see <remote> after all dashed options and before refspecs.

> If it makes you feel better, I only found --refmap because I was the one
> who implemented the original "update based on refspecs" logic, and while
> looking for that commit with "git log --grep=opportunistic" I stumbled
> onto Junio's commit adding --refmap, which referenced mine. Maybe this
> also works as a good case study in why we should write good commit
> messages and refer to related work. :)

;-)

> Anyway, I wasn't at all sure that a blank --refmap= would do what you
> want until I tried it. But it was always intended to work that way. From
> c5558f80c3 (fetch: allow explicit --refmap to override configuration,
> 2014-05-29):
>
>   +static int parse_refmap_arg(const struct option *opt, const char *arg, int unset)
>   +{
>   +       ALLOC_GROW(refmap_array, refmap_nr + 1, refmap_alloc);
>   +
>   +       /*
>   +        * "git fetch --refmap='' origin foo"
>   +        * can be used to tell the command not to store anywhere
>   +        */
>   +       if (*arg)
>   +               refmap_array[refmap_nr++] = arg;
>   +       return 0;
>   +}

Good analysis, and also the answer to my previous question is also
there ;-)
Junio C Hamano Jan. 21, 2020, 6:24 p.m. UTC | #4
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> To prevent long blocking time during a 'git fetch' call, a user
> may want to set up a schedule for background 'git fetch' processes.
> However, these runs will update the refs/remotes branches due to
> the default refspec set in the config when Git adds a remote.
> Hence the user will not notice when remote refs are updated during
> their foreground fetches. In fact, they may _want_ those refs to
> stay put so they can work with the refs from their last foreground
> fetch call.
>
> This can be accomplished by overriding the configured refspec using
> '--refmap=' along with a custom refspec:
>
>   git fetch <remote> --refmap= +refs/heads/*:refs/hidden/<remote>/*
>
> to populate a custom ref space and download a pack of the new
> reachable objects. This kind of call allows a few things to happen:
>
> 1. We download a new pack if refs have updated.
> 2. Since the refs/hidden branches exist, GC will not remove the
>    newly-downloaded data.
> 3. With fetch.writeCommitGraph enabled, the refs/hidden refs are
>    used to update the commit-graph file.
>
> To avoid the refs/hidden directory from filling without bound, the
> --prune option can be included. When providing a refspec like this,
> the --prune option does not delete remote refs and instead only
> deletes refs in the target refspace.
>
> Update the documentation to clarify how '--refmap=""' works and
> create tests to guarantee this behavior remains in the future.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---

Thanks.  Looks quite good.

Will queue.
Jeff King Jan. 21, 2020, 7:06 p.m. UTC | #5
On Tue, Jan 21, 2020 at 01:01:47PM -0500, Derrick Stolee wrote:

> > This isn't strictly related to your patch, but since the rationale here
> > describes the concept of a background job and people might end up using
> > it as a reference, do you want to add in --no-tags to help them out?
> 
> That's a good idea. I keep forgetting about that. It's interesting that
> tags are fetched even though my refpsec does not include refs/tags.

Yeah, because tag-following is an extra thing outside of the refspecs. I
think it would be nice if there were a way to specify a "following"
refspec, something like:

  ~refs/tags/*:refs/tags/*

(where "~" is just a character I picked out of the blue that I think
doesn't have any other meaning there now). And then we could specify it
in the config alongside other refspecs, override it with a refmap, etc.

As a bonus, it would also give us a stepping stone towards being able to
do remote-specific tags like:

  [remote "origin"]
  url = ...
  fetch = +refs/heads/*:refs/remotes/origin/heads/*
  fetch = ~refs/tags/*:refs/remotes/origin/tags/*

But I'm sure there are a lot of backwards-compatibility gotchas.

-Peff

Patch
diff mbox series

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index a2f78624a2..a115a1ae0e 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -139,7 +139,10 @@  ifndef::git-pull[]
 	specified refspec (can be given more than once) to map the
 	refs to remote-tracking branches, instead of the values of
 	`remote.*.fetch` configuration variables for the remote
-	repository.  See section on "Configured Remote-tracking
+	repository.  Providing an empty `<refspec>` to the
+	`--refmap` option causes Git to ignore the configured
+	refspecs and rely entirely on the refspecs supplied as
+	command-line arguments. See section on "Configured Remote-tracking
 	Branches" for details.
 
 -t::
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 4b60282689..5f8f1c287f 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -174,6 +174,30 @@  test_expect_success 'fetch --prune --tags with refspec prunes based on refspec'
 	git rev-parse sometag
 '
 
+test_expect_success '--refmap="" ignores configured refspec' '
+	cd "$TRASH_DIRECTORY" &&
+	git clone "$D" remote-refs &&
+	git -C remote-refs rev-parse remotes/origin/master >old &&
+	git -C remote-refs update-ref refs/remotes/origin/master master~1 &&
+	git -C remote-refs rev-parse remotes/origin/master >new &&
+	git -C remote-refs fetch --refmap= origin "+refs/heads/*:refs/hidden/origin/*" &&
+	git -C remote-refs rev-parse remotes/origin/master >actual &&
+	test_cmp new actual &&
+	git -C remote-refs fetch origin &&
+	git -C remote-refs rev-parse remotes/origin/master >actual &&
+	test_cmp old actual
+'
+
+test_expect_success '--refmap="" and --prune' '
+	git -C remote-refs update-ref refs/remotes/origin/foo/otherbranch master &&
+	git -C remote-refs update-ref refs/hidden/foo/otherbranch master &&
+	git -C remote-refs fetch --prune --refmap="" origin +refs/heads/*:refs/hidden/* &&
+	git -C remote-refs rev-parse remotes/origin/foo/otherbranch &&
+	test_must_fail git -C remote-refs rev-parse refs/hidden/foo/otherbranch &&
+	git -C remote-refs fetch --prune origin &&
+	test_must_fail git -C remote-refs rev-parse remotes/origin/foo/otherbranch
+'
+
 test_expect_success 'fetch tags when there is no tags' '
 
     cd "$D" &&