diff mbox series

[2/2] partial-clone: avoid fetching when looking for objects

Message ID 937a882261d4d4552a144e5f0efad8abd8002ab4.1582129312.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Document two partial clone bugs, fix one | expand

Commit Message

Linus Arver via GitGitGadget Feb. 19, 2020, 4:21 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

When using partial-clone, do_oid_object_info_extended() can trigger a
fetch for missing objects. This can be extremely expensive when asking
for a tag or commit, as we are completely removed from the context of
the missing object and thus supply no "haves" in the request.

6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
global variable that prevented these fetches in favor of a bitflag.
However, some object existence checks were not updated to use this flag.

Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
repreparing the pack-file structures. We need to be extremely careful
about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
due to updated refs.

This resolves a broken test in t5616-partial-clone.sh.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/fetch.c          | 10 +++++-----
 t/t5616-partial-clone.sh |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

Comments

Jonathan Tan Feb. 19, 2020, 6:10 p.m. UTC | #1
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> When using partial-clone, do_oid_object_info_extended() can trigger a
> fetch for missing objects. This can be extremely expensive when asking
> for a tag or commit, as we are completely removed from the context of
> the missing object and thus supply no "haves" in the request.
> 
> 6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
> global variable that prevented these fetches in favor of a bitflag.
> However, some object existence checks were not updated to use this flag.
> 
> Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
> addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
> repreparing the pack-file structures. We need to be extremely careful
> about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
> due to updated refs.
> 
> This resolves a broken test in t5616-partial-clone.sh.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

Thanks for catching this.

I wonder if the commit message in this patch could be better worded -
the first paragraph seems to say that fetching missing commits and tags
are expensive, but that is not the problem here; the problem is that the
client lazily fetches refs advertised by the server, thinking that it is
lacking them due to partial clone, even when there is no expectation
that the client have them (so the commits and tags are not truly
missing).

So I would reword the first paragraph as:

  When using partial clone, find_non_local_tags() in builtin/fetch.c
  checks each remote tag to see if its object also exists locally. There
  is no expectation that the object exist locally, but this function
  nevertheless triggers a lazy fetch if the object does not exist. This
  can be extremely expensive when asking for a commit, as we are
  completely removed from the context of the non-existent object and
  thus supply no "haves" in the request.

All this rests on my thinking that "missing" has the connotation (or
actual meaning) that we expect the object to be there. If we think that
"missing" can also mean that the remote has it but the local doesn't,
then you can ignore what I just said :-)

Other than that, both patches look good to me.
diff mbox series

Patch

diff --git a/builtin/fetch.c b/builtin/fetch.c
index b4c6d921d06..fd69c4f69d7 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -335,6 +335,7 @@  static void find_non_local_tags(const struct ref *refs,
 	struct string_list_item *remote_ref_item;
 	const struct ref *ref;
 	struct refname_hash_entry *item = NULL;
+	const int quick_flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;
 
 	refname_hash_init(&existing_refs);
 	refname_hash_init(&remote_refs);
@@ -353,10 +354,9 @@  static void find_non_local_tags(const struct ref *refs,
 		 */
 		if (ends_with(ref->name, "^{}")) {
 			if (item &&
-			    !has_object_file_with_flags(&ref->old_oid,
-							OBJECT_INFO_QUICK) &&
+			    !has_object_file_with_flags(&ref->old_oid, quick_flags) &&
 			    !oidset_contains(&fetch_oids, &ref->old_oid) &&
-			    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+			    !has_object_file_with_flags(&item->oid, quick_flags) &&
 			    !oidset_contains(&fetch_oids, &item->oid))
 				clear_item(item);
 			item = NULL;
@@ -370,7 +370,7 @@  static void find_non_local_tags(const struct ref *refs,
 		 * fetch.
 		 */
 		if (item &&
-		    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+		    !has_object_file_with_flags(&item->oid, quick_flags) &&
 		    !oidset_contains(&fetch_oids, &item->oid))
 			clear_item(item);
 
@@ -391,7 +391,7 @@  static void find_non_local_tags(const struct ref *refs,
 	 * checked to see if it needs fetching.
 	 */
 	if (item &&
-	    !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
+	    !has_object_file_with_flags(&item->oid, quick_flags) &&
 	    !oidset_contains(&fetch_oids, &item->oid))
 		clear_item(item);
 
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index ed2ef45c37a..c70516734d5 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -385,7 +385,7 @@  test_expect_failure 'verify fetch succeeds when asking for new tags' '
 	git -C tag-test fetch --tags origin
 '
 
-test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+test_expect_success 'verify fetch downloads only one pack when updating refs' '
 	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
 	ls pack-test/.git/objects/pack/*pack >pack-list &&
 	test_line_count = 2 pack-list &&