Message ID | 20200401121537.GA1916590@coredump.intra.peff.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | clone: use "quick" lookup while following tags | expand |
01.04.2020, 15:15, "Jeff King" <peff@peff.net>: > On Sat, Mar 28, 2020 at 10:40:23AM -0400, Jeff King wrote: > >> So I guess the problem is not with shallow clones specifically, but they >> lead us to not having fetched the commits pointed to by tags, which >> leads to us trying to fault in those commits (and their trees) rather >> than realizing that we weren't meant to have them. And the size of the >> local repo balloons because you're fetching all those commits one by >> one, and not getting the benefit of the deltas you would when you do a >> single --filter=blob:none fetch. >> >> I guess we need something like this: > > The issue is actually with --single-branch, which is implied by --depth. > But the fix is the same either way. > > Here it is with a commit message and test. > > -- >8 -- > Subject: [PATCH] clone: use "quick" lookup while following tags > > When cloning with --single-branch, we implement git-fetch's usual > tag-following behavior, grabbing any tag objects that point to objects > we have locally. > > When we're a partial clone, though, our has_object_file() check will > actually lazy-fetch each tag. That not only defeats the purpose of > --single-branch, but it does it incredibly slowly, potentially kicking > off a new fetch for each tag. This is even worse for a shallow clone, > which implies --single-branch, because even tags which are supersets of > each other will be fetched individually. > > We can fix this by passing OBJECT_INFO_SKIP_FETCH_OBJECT to the call, > which is what git-fetch does in this case. > > Likewise, let's include OBJECT_INFO_QUICK, as that's what git-fetch > does. The rationale is discussed in 5827a03545 (fetch: use "quick" > has_sha1_file for tag following, 2016-10-13), but here the tradeoff > would apply even more so because clone is very unlikely to be racing > with another process repacking our newly-created repository. > > This may provide a very small speedup even in the non-partial case case, > as we'd avoid calling reprepare_packed_git() for each tag (though in > practice, we'd only have a single packfile, so that reprepare should be > quite cheap). > > Signed-off-by: Jeff King <peff@peff.net> > --- > builtin/clone.c | 4 +++- > t/t5616-partial-clone.sh | 8 ++++++++ > 2 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/builtin/clone.c b/builtin/clone.c > index d8b1f413aa..9da6459f1d 100644 > --- a/builtin/clone.c > +++ b/builtin/clone.c > @@ -643,7 +643,9 @@ static void write_followtags(const struct ref *refs, const char *msg) > continue; > if (ends_with(ref->name, "^{}")) > continue; > - if (!has_object_file(&ref->old_oid)) > + if (!has_object_file_with_flags(&ref->old_oid, > + OBJECT_INFO_QUICK | > + OBJECT_INFO_SKIP_FETCH_OBJECT)) > continue; > update_ref(msg, ref->name, &ref->old_oid, NULL, 0, > UPDATE_REFS_DIE_ON_ERR); > diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh > index 77bb91e976..8f0d81a27e 100755 > --- a/t/t5616-partial-clone.sh > +++ b/t/t5616-partial-clone.sh > @@ -415,6 +415,14 @@ test_expect_success 'verify fetch downloads only one pack when updating refs' ' > test_line_count = 3 pack-list > ' > > +test_expect_success 'single-branch tag following respects partial clone' ' > + git clone --single-branch -b B --filter=blob:none \ > + "file://$(pwd)/srv.bare" single && > + git -C single rev-parse --verify refs/tags/B && > + git -C single rev-parse --verify refs/tags/A && > + test_must_fail git -C single rev-parse --verify refs/tags/C > +' > + > . "$TEST_DIRECTORY"/lib-httpd.sh > start_httpd > > -- > 2.26.0.408.gebd8a4413c Would you recommend using this patch in production, or is it better to wait for reviews?
On Wed, Apr 01, 2020 at 10:12:30PM +0300, Konstantin Tokarev wrote:
> Would you recommend using this patch in production, or is it better to wait for reviews?
Only you know the risk tolerance of your production systems, so I make
no guarantees. But I would personally think it is quite safe.
-Peff
diff --git a/builtin/clone.c b/builtin/clone.c index d8b1f413aa..9da6459f1d 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -643,7 +643,9 @@ static void write_followtags(const struct ref *refs, const char *msg) continue; if (ends_with(ref->name, "^{}")) continue; - if (!has_object_file(&ref->old_oid)) + if (!has_object_file_with_flags(&ref->old_oid, + OBJECT_INFO_QUICK | + OBJECT_INFO_SKIP_FETCH_OBJECT)) continue; update_ref(msg, ref->name, &ref->old_oid, NULL, 0, UPDATE_REFS_DIE_ON_ERR); diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index 77bb91e976..8f0d81a27e 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -415,6 +415,14 @@ test_expect_success 'verify fetch downloads only one pack when updating refs' ' test_line_count = 3 pack-list ' +test_expect_success 'single-branch tag following respects partial clone' ' + git clone --single-branch -b B --filter=blob:none \ + "file://$(pwd)/srv.bare" single && + git -C single rev-parse --verify refs/tags/B && + git -C single rev-parse --verify refs/tags/A && + test_must_fail git -C single rev-parse --verify refs/tags/C +' + . "$TEST_DIRECTORY"/lib-httpd.sh start_httpd