diff mbox series

[v2,1/1] pack-objects: ignore ambiguous object warnings

Message ID 002868ee6bec3dac38749d0f05bf2db8da0969a5.1541536484.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series send-pack: set core.warnAmbiguousRefs=false | expand

Commit Message

Linus Arver via GitGitGadget Nov. 6, 2018, 8:34 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

A git push process runs several processes during its run, but one
includes git send-pack which calls git pack-objects and passes
the known have/wants into stdin using object ids. However, the
default setting for core.warnAmbiguousRefs requires git pack-objects
to check for ref names matching the ref_rev_parse_rules array in
refs.c. This means that every object is triggering at least six
"file exists?" queries.  When there are a lot of refs, this can
add up significantly! I observed a simple push spending three
seconds checking these paths.

The fix here is similar to 4c30d50 "rev-list: disable object/refname
ambiguity check with --stdin". Save the value of the global
warn_on_object_refname_ambiguity variable (which is usually set to
the boolean config variable core.warnAmbiguousRefs) and change the
state to false. Do this only during the get_object_list() method
which reads the objects from stdin.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/pack-objects.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Jeff King Nov. 6, 2018, 9:12 p.m. UTC | #1
On Tue, Nov 06, 2018 at 12:34:47PM -0800, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> A git push process runs several processes during its run, but one
> includes git send-pack which calls git pack-objects and passes
> the known have/wants into stdin using object ids. However, the
> default setting for core.warnAmbiguousRefs requires git pack-objects
> to check for ref names matching the ref_rev_parse_rules array in
> refs.c. This means that every object is triggering at least six
> "file exists?" queries.  When there are a lot of refs, this can
> add up significantly! I observed a simple push spending three
> seconds checking these paths.

Thanks, this fills out the details from the cover letter a bit better.

> The fix here is similar to 4c30d50 "rev-list: disable object/refname
> ambiguity check with --stdin". Save the value of the global
> warn_on_object_refname_ambiguity variable (which is usually set to
> the boolean config variable core.warnAmbiguousRefs) and change the
> state to false. Do this only during the get_object_list() method
> which reads the objects from stdin.

I think this parenthetical isn't quite right. We never change the value
of warn_on_object_refname_ambiguity based on user config. It's just that
if warn_ambiguous_refs is off, we do not even bother looking at the
object_refname variant.

So we'd never expect to see anything except "1" in our save_warning
variable. Doing a save/restore is just about code hygiene and
maintainability.

Other than that minor nit, this whole thing looks good to me.

-Peff
Junio C Hamano Nov. 7, 2018, 1:54 a.m. UTC | #2
Jeff King <peff@peff.net> writes:

> So we'd never expect to see anything except "1" in our save_warning
> variable. Doing a save/restore is just about code hygiene and
> maintainability.

Here is what I plan to queue.  Thanks, both.

-- >8 --
From: Derrick Stolee <dstolee@microsoft.com>
Date: Tue, 6 Nov 2018 12:34:47 -0800
Subject: [PATCH] pack-objects: ignore ambiguous object warnings

A git push process runs several processes during its run, but one
includes git send-pack which calls git pack-objects and passes
the known have/wants into stdin using object ids. However, the
default setting for core.warnAmbiguousRefs requires git pack-objects
to check for ref names matching the ref_rev_parse_rules array in
refs.c. This means that every object is triggering at least six
"file exists?" queries.  When there are a lot of refs, this can
add up significantly! I observed a simple push spending three
seconds checking these paths.

The fix here is similar to 4c30d50 "rev-list: disable object/refname
ambiguity check with --stdin".  While the get_object_list() method
reads the objects from stdin, turn warn_on_object_refname_ambiguity
flag (which is usually true) to false.  Just for code hygiene, save
away the original at the beginning and restore it once we are done.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/pack-objects.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d1144a8f7e..f703e6df9b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2988,6 +2988,7 @@ static void get_object_list(int ac, const char **av)
 	struct rev_info revs;
 	char line[1000];
 	int flags = 0;
+	int save_warning;
 
 	init_revisions(&revs, NULL);
 	save_commit_buffer = 0;
@@ -2996,6 +2997,9 @@ static void get_object_list(int ac, const char **av)
 	/* make sure shallows are read */
 	is_repository_shallow(the_repository);
 
+	save_warning = warn_on_object_refname_ambiguity;
+	warn_on_object_refname_ambiguity = 0;
+
 	while (fgets(line, sizeof(line), stdin) != NULL) {
 		int len = strlen(line);
 		if (len && line[len - 1] == '\n')
@@ -3022,6 +3026,8 @@ static void get_object_list(int ac, const char **av)
 			die(_("bad revision '%s'"), line);
 	}
 
+	warn_on_object_refname_ambiguity = save_warning;
+
 	if (use_bitmap_index && !get_object_list_from_bitmap(&revs))
 		return;
Junio C Hamano Nov. 7, 2018, 8:52 a.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> The fix here is similar to 4c30d50 "rev-list: disable object/refname
> ambiguity check with --stdin".  While the get_object_list() method
> reads the objects from stdin, turn warn_on_object_refname_ambiguity
> flag (which is usually true) to false.  Just for code hygiene, save
> away the original at the beginning and restore it once we are done.

I actually think this is a bit too broad.  The calling process of
this program does know that it is feeding list of raw object names
(prefixed with ^ for negative ones), but the codepath this patch
touches does not know who is calling it with what.  It would be
safer to introduce a mechanism for the caller to tell this codepath
not to bother checking refnames, as it knows it is feeding the raw
object names and not refnames.

After all, you can feed things like "refs/heads/master" from the
standard input of "git pack-objects --revs", no?

> Helped-by: Jeff King <peff@peff.net>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/pack-objects.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index d1144a8f7e..f703e6df9b 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -2988,6 +2988,7 @@ static void get_object_list(int ac, const char **av)
>  	struct rev_info revs;
>  	char line[1000];
>  	int flags = 0;
> +	int save_warning;
>  
>  	init_revisions(&revs, NULL);
>  	save_commit_buffer = 0;
> @@ -2996,6 +2997,9 @@ static void get_object_list(int ac, const char **av)
>  	/* make sure shallows are read */
>  	is_repository_shallow(the_repository);
>  
> +	save_warning = warn_on_object_refname_ambiguity;
> +	warn_on_object_refname_ambiguity = 0;
> +
>  	while (fgets(line, sizeof(line), stdin) != NULL) {
>  		int len = strlen(line);
>  		if (len && line[len - 1] == '\n')
> @@ -3022,6 +3026,8 @@ static void get_object_list(int ac, const char **av)
>  			die(_("bad revision '%s'"), line);
>  	}
>  
> +	warn_on_object_refname_ambiguity = save_warning;
> +
>  	if (use_bitmap_index && !get_object_list_from_bitmap(&revs))
>  		return;
Jeff King Nov. 7, 2018, 9:21 a.m. UTC | #4
On Wed, Nov 07, 2018 at 05:52:05PM +0900, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > The fix here is similar to 4c30d50 "rev-list: disable object/refname
> > ambiguity check with --stdin".  While the get_object_list() method
> > reads the objects from stdin, turn warn_on_object_refname_ambiguity
> > flag (which is usually true) to false.  Just for code hygiene, save
> > away the original at the beginning and restore it once we are done.
> 
> I actually think this is a bit too broad.  The calling process of
> this program does know that it is feeding list of raw object names
> (prefixed with ^ for negative ones), but the codepath this patch
> touches does not know who is calling it with what.  It would be
> safer to introduce a mechanism for the caller to tell this codepath
> not to bother checking refnames, as it knows it is feeding the raw
> object names and not refnames.
> 
> After all, you can feed things like "refs/heads/master" from the
> standard input of "git pack-objects --revs", no?

Keep in mind that this is not disallowing "refs/heads/master", nor even
disabling the ambiguity checking if we feed a "foo" that exists as both
a tag and a branch.

It is only disabling the check that when the caller feeds a 40-hex sha1,
we do not double-check to make sure that there is not a ref of the same
name (which is not even really an ambiguity, but just an informational
message for the user; the object id has always and must always take
precedence).

So yes, the caller does know better "I am only going to feed object
ids". But you can likewise look from the callee's perspective: "I am
going to take a lot of inputs, and spending time for an informational
message for each one is not worth doing".

So I think it's justifiable from that point of view. And from a
practical point of view, it's much simpler: we do not have teach every
caller to specify its input, or risk being slowed down by a low-value
informational check.

-Peff
diff mbox series

Patch

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d1144a8f7e..f703e6df9b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2988,6 +2988,7 @@  static void get_object_list(int ac, const char **av)
 	struct rev_info revs;
 	char line[1000];
 	int flags = 0;
+	int save_warning;
 
 	init_revisions(&revs, NULL);
 	save_commit_buffer = 0;
@@ -2996,6 +2997,9 @@  static void get_object_list(int ac, const char **av)
 	/* make sure shallows are read */
 	is_repository_shallow(the_repository);
 
+	save_warning = warn_on_object_refname_ambiguity;
+	warn_on_object_refname_ambiguity = 0;
+
 	while (fgets(line, sizeof(line), stdin) != NULL) {
 		int len = strlen(line);
 		if (len && line[len - 1] == '\n')
@@ -3022,6 +3026,8 @@  static void get_object_list(int ac, const char **av)
 			die(_("bad revision '%s'"), line);
 	}
 
+	warn_on_object_refname_ambiguity = save_warning;
+
 	if (use_bitmap_index && !get_object_list_from_bitmap(&revs))
 		return;