[v2,0/2] Document two partial clone bugs, fix one

Message ID pull.556.v2.git.1582321648.gitgitgadget@gmail.com (mailing list archive)

Message

Derrick Stolee via GitGitGadget Feb. 21, 2020, 9:47 p.m. UTC
While playing with partial clone, I discovered a few bugs and document them
with tests in patch 1. One seems to be a server-side bug that happens in a
somewhat rare situation, but not terribly unlikely. The other is a
client-side bug that leads to quadratic amounts of data transfer; I fix this
bug in patch 2.
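
As a rough illustration of why separate fetches without "haves" add up to quadratic transfer, the sketch below counts commits sent under a deliberately simplified model. The model is an assumption made for illustration, not a measurement: a linear history in which ref i points at commit number i, and the function names are hypothetical.

```shell
#!/bin/sh
# Simplified model (an assumption, not measured data): a linear history
# in which ref number i points at commit number i.

# Commits sent when each of n refs is fetched on its own with no "haves":
# the request for ref i receives all i commits reachable from it.
commits_without_haves () {
	n=$1
	total=0
	i=1
	while [ "$i" -le "$n" ]
	do
		total=$((total + i))
		i=$((i + 1))
	done
	echo "$total"
}

# Commits sent when a single negotiated fetch covers all n refs:
# each commit is sent once.
commits_single_pack () {
	echo "$1"
}

commits_without_haves 100   # prints 5050, i.e. n(n+1)/2
commits_single_pack 100     # prints 100
```

Under this model the unnegotiated case grows as n(n+1)/2 while the single-pack case grows linearly in n.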

UPDATES in V2:

 * Added "|| return 1" inside the for loops.

 * Added an in-test comment about the test ordering.

 * Required protocol.version=2 in the tags test due to the bisect Junio
   performed.

 * Updated the commit message via Jonathan Tan's suggestion.
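
The first item matters because a shell for loop reports only the exit status of its last iteration, so a failure in an earlier iteration is silently lost inside an &&-chain. A minimal sketch of the difference follows; step, loop_unchecked and loop_checked are hypothetical names standing in for the test_commit and branch commands in the real tests, and the use of "return" assumes the loop body runs inside a function.

```shell
#!/bin/sh
# Hypothetical stand-in for a test step: fails only for the argument "J".
step () {
	test "$1" != J
}

# Without "|| return 1": the loop's status is that of the last iteration
# (K, which succeeds), so the failure on J goes unnoticed.
loop_unchecked () {
	for i in I J K
	do
		step "$i"
	done
}

# With "|| return 1": the first failing step aborts the function.
loop_checked () {
	for i in I J K
	do
		step "$i" || return 1
	done
}

if loop_unchecked; then echo "unchecked: success"; else echo "unchecked: failure"; fi
if loop_checked; then echo "checked: success"; else echo "checked: failure"; fi
# prints:
#   unchecked: success
#   checked: failure
```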
   
   

You can ignore the stack traces I sent earlier, as those seem to be from
states I cannot get into without being destructive to my .git directory.

Thanks, -Stolee

Derrick Stolee (2):
  partial-clone: demonstrate bugs in partial fetch
  partial-clone: avoid fetching when looking for objects

 builtin/fetch.c          | 10 +++++-----
 t/t5616-partial-clone.sh | 31 +++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 5 deletions(-)


base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-556%2Fderrickstolee%2Fpartial-clone-fix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-556/derrickstolee/partial-clone-fix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/556

Range-diff vs v1:

 1:  dbc1bdcae16 ! 1:  04965a8c7a4 partial-clone: demonstrate bugs in partial fetch
     @@ -14,7 +14,8 @@
          In the first test, we find that when fetching with blob filters from
          a repository that previously did not have any tags, the 'git fetch
          --tags origin' command fails because the server sends "multiple
     -    filter-specs cannot be combined".
     +    filter-specs cannot be combined". This only happens when using
     +    protocol v2.
      
          In the second test, we see that a 'git fetch origin' request with
          several ref updates results in multiple pack-file downloads. This must
     @@ -41,15 +42,20 @@
       	grep "want $(cat hash)" trace
       '
       
     ++# The following two tests must be in this order, or else
     ++# the first will not fail. It is important that the srv.bare
     ++# repository did not have tags during clone, but has tags
     ++# in the fetch.
     ++
      +test_expect_failure 'verify fetch succeeds when asking for new tags' '
      +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
      +	for i in I J K
      +	do
      +		test_commit -C src $i &&
     -+		git -C src branch $i
     ++		git -C src branch $i || return 1
      +	done &&
      +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
     -+	git -C tag-test fetch --tags origin
     ++	git -C tag-test -c protocol.version=2 fetch --tags origin
      +'
      +
      +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
     @@ -59,7 +65,7 @@
      +	for i in A B C
      +	do
      +		test_commit -C src $i &&
     -+		git -C src branch $i
     ++		git -C src branch $i || return 1
      +	done &&
      +	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
      +	git -C pack-test fetch origin &&
 2:  937a882261d ! 2:  7c4c9f0f8e1 partial-clone: avoid fetching when looking for objects
     @@ -2,10 +2,13 @@
      
          partial-clone: avoid fetching when looking for objects
      
     -    When using partial-clone, do_oid_object_info_extended() can trigger a
     -    fetch for missing objects. This can be extremely expensive when asking
     -    for a tag or commit, as we are completely removed from the context of
     -    the missing object and thus supply no "haves" in the request.
     +    When using partial clone, find_non_local_tags() in builtin/fetch.c
     +    checks each remote tag to see if its object also exists locally. There
     +    is no expectation that the object exist locally, but this function
     +    nevertheless triggers a lazy fetch if the object does not exist. This
     +    can be extremely expensive when asking for a commit, as we are
     +    completely removed from the context of the non-existent object and
     +    thus supply no "haves" in the request.
      
          6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
          global variable that prevented these fetches in favor of a bitflag.
     @@ -68,7 +71,7 @@
       --- a/t/t5616-partial-clone.sh
       +++ b/t/t5616-partial-clone.sh
      @@
     - 	git -C tag-test fetch --tags origin
     + 	git -C tag-test -c protocol.version=2 fetch --tags origin
       '
       
      -test_expect_failure 'verify fetch downloads only one pack when updating refs' '

Comments

Junio C Hamano Feb. 22, 2020, 5:25 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> While playing with partial clone, I discovered a few bugs and document them
> with tests in patch 1. One seems to be a server-side bug that happens in a
> somewhat rare situation, but not terribly unlikely. The other is a
> client-side bug that leads to quadratic amounts of data transfer; I fix this
> bug in patch 2.
>
> UPDATES in V2:
>
>  * Added "|| return 1" inside the for loops.
>
>  * Added an in-test comment about the test ordering.
>
>  * Required protocol.version=2 in the tags test due to the bisect Junio
>    performed.
>
>  * Updated the commit message via Jonathan Tan's suggestion.

Now that this can safely be queued directly on v2.25.0, I'll
rebase it (earlier I queued it after the merge to make protocol v2
the default).

Thanks.