Message ID | 002868ee6bec3dac38749d0f05bf2db8da0969a5.1541536484.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | send-pack: set core.warnAmbiguousRefs=false | expand |
On Tue, Nov 06, 2018 at 12:34:47PM -0800, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee <dstolee@microsoft.com> > > A git push process runs several processes during its run, but one > includes git send-pack which calls git pack-objects and passes > the known have/wants into stdin using object ids. However, the > default setting for core.warnAmbiguousRefs requires git pack-objects > to check for ref names matching the ref_rev_parse_rules array in > refs.c. This means that every object is triggering at least six > "file exists?" queries. When there are a lot of refs, this can > add up significantly! I observed a simple push spending three > seconds checking these paths. Thanks, this fills out the details from the cover letter a bit better. > The fix here is similar to 4c30d50 "rev-list: disable object/refname > ambiguity check with --stdin". Save the value of the global > warn_on_object_refname_ambiguity variable (which is usually set to > the boolean config variable core.warnAmbiguousRefs) and change the > state to false. Do this only during the get_object_list() method > which reads the objects from stdin. I think this parenthetical isn't quite right. We never change the value of warn_on_object_refname_ambiguity based on user config. It's just that if warn_ambiguous_refs is off, we do not even bother looking at the object_refname variant. So we'd never expect to see anything except "1" in our save_warning variable. Doing a save/restore is just about code hygiene and maintainability. Other than that minor nit, this whole thing looks good to me. -Peff
Jeff King <peff@peff.net> writes: > So we'd never expect to see anything except "1" in our save_warning > variable. Doing a save/restore is just about code hygiene and > maintainability. Here is what I plan to queue. Thanks, both. -- >8 -- From: Derrick Stolee <dstolee@microsoft.com> Date: Tue, 6 Nov 2018 12:34:47 -0800 Subject: [PATCH] pack-objects: ignore ambiguous object warnings A git push process runs several processes during its run, but one includes git send-pack which calls git pack-objects and passes the known have/wants into stdin using object ids. However, the default setting for core.warnAmbiguousRefs requires git pack-objects to check for ref names matching the ref_rev_parse_rules array in refs.c. This means that every object is triggering at least six "file exists?" queries. When there are a lot of refs, this can add up significantly! I observed a simple push spending three seconds checking these paths. The fix here is similar to 4c30d50 "rev-list: disable object/refname ambiguity check with --stdin". While the get_object_list() method reads the objects from stdin, turn warn_on_object_refname_ambiguity flag (which is usually true) to false. Just for code hygiene, save away the original at the beginning and restore it once we are done. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- builtin/pack-objects.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d1144a8f7e..f703e6df9b 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2988,6 +2988,7 @@ static void get_object_list(int ac, const char **av) struct rev_info revs; char line[1000]; int flags = 0; + int save_warning; init_revisions(&revs, NULL); save_commit_buffer = 0; @@ -2996,6 +2997,9 @@ static void get_object_list(int ac, const char **av) /* make sure shallows are read */ is_repository_shallow(the_repository); + save_warning = warn_on_object_refname_ambiguity; + warn_on_object_refname_ambiguity = 0; + while (fgets(line, sizeof(line), stdin) != NULL) { int len = strlen(line); if (len && line[len - 1] == '\n') @@ -3022,6 +3026,8 @@ static void get_object_list(int ac, const char **av) die(_("bad revision '%s'"), line); } + warn_on_object_refname_ambiguity = save_warning; + if (use_bitmap_index && !get_object_list_from_bitmap(&revs)) return;
Junio C Hamano <gitster@pobox.com> writes: > The fix here is similar to 4c30d50 "rev-list: disable object/refname > ambiguity check with --stdin". While the get_object_list() method > reads the objects from stdin, turn warn_on_object_refname_ambiguity > flag (which is usually true) to false. Just for code hygiene, save > away the original at the beginning and restore it once we are done. I actually think this is a bit too broad. The calling process of this program does know that it is feeding list of raw object names (prefixed with ^ for negative ones), but the codepath this patch touches does not know who is calling it with what. It would be safer to introduce a mechanism for the caller to tell this codepath not to bother checking refnames, as it knows it is feeding the raw object names and not refnames. After all, you can feed things like "refs/heads/master" from the standard input of "git pack-objects --revs", no? > Helped-by: Jeff King <peff@peff.net> > Signed-off-by: Derrick Stolee <dstolee@microsoft.com> > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > builtin/pack-objects.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index d1144a8f7e..f703e6df9b 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -2988,6 +2988,7 @@ static void get_object_list(int ac, const char **av) > struct rev_info revs; > char line[1000]; > int flags = 0; > + int save_warning; > > init_revisions(&revs, NULL); > save_commit_buffer = 0; > @@ -2996,6 +2997,9 @@ static void get_object_list(int ac, const char **av) > /* make sure shallows are read */ > is_repository_shallow(the_repository); > > + save_warning = warn_on_object_refname_ambiguity; > + warn_on_object_refname_ambiguity = 0; > + > while (fgets(line, sizeof(line), stdin) != NULL) { > int len = strlen(line); > if (len && line[len - 1] == '\n') > @@ -3022,6 +3026,8 @@ static void get_object_list(int ac, const char **av) > die(_("bad revision '%s'"), line); > } > > + warn_on_object_refname_ambiguity = save_warning; > + > if (use_bitmap_index && !get_object_list_from_bitmap(&revs)) > return;
On Wed, Nov 07, 2018 at 05:52:05PM +0900, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > The fix here is similar to 4c30d50 "rev-list: disable object/refname > > ambiguity check with --stdin". While the get_object_list() method > > reads the objects from stdin, turn warn_on_object_refname_ambiguity > > flag (which is usually true) to false. Just for code hygiene, save > > away the original at the beginning and restore it once we are done. > > I actually think this is a bit too broad. The calling process of > this program does know that it is feeding list of raw object names > (prefixed with ^ for negative ones), but the codepath this patch > touches does not know who is calling it with what. It would be > safer to introduce a mechanism for the caller to tell this codepath > not to bother checking refnames, as it knows it is feeding the raw > object names and not refnames. > > After all, you can feed things like "refs/heads/master" from the > standard input of "git pack-objects --revs", no? Keep in mind that this is not disallowing "refs/heads/master", nor even disabling the ambiguity checking if we feed a "foo" that exists as both a tag and a branch. It is only disabling the check that when the caller feeds a 40-hex sha1, we do not double-check to make sure that there is not a ref of the same name (which is not even really an ambiguity, but just an informational message for the user; the object id has always and must always take precedence). So yes, the caller does know better "I am only going to feed object ids". But you can likewise look from the callee's perspective: "I am going to take a lot of inputs, and spending time for an informational message for each one is not worth doing". So I think it's justifiable from that point of view. And from a practical point of view, it's much simpler: we do not have teach every caller to specify its input, or risk being slowed down by a low-value informational check. -Peff
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d1144a8f7e..f703e6df9b 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2988,6 +2988,7 @@ static void get_object_list(int ac, const char **av) struct rev_info revs; char line[1000]; int flags = 0; + int save_warning; init_revisions(&revs, NULL); save_commit_buffer = 0; @@ -2996,6 +2997,9 @@ static void get_object_list(int ac, const char **av) /* make sure shallows are read */ is_repository_shallow(the_repository); + save_warning = warn_on_object_refname_ambiguity; + warn_on_object_refname_ambiguity = 0; + while (fgets(line, sizeof(line), stdin) != NULL) { int len = strlen(line); if (len && line[len - 1] == '\n') @@ -3022,6 +3026,8 @@ static void get_object_list(int ac, const char **av) die(_("bad revision '%s'"), line); } + warn_on_object_refname_ambiguity = save_warning; + if (use_bitmap_index && !get_object_list_from_bitmap(&revs)) return;