diff mbox series

setlocalversion: Add workaround for "git describe" performance issue

Message ID 309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org (mailing list archive)
State New
Headers show
Series setlocalversion: Add workaround for "git describe" performance issue | expand

Commit Message

Josh Poimboeuf Oct. 31, 2024, 1:20 a.m. UTC
If HEAD isn't associated with an annotated tag, a bug (or feature?) in
"git describe --match" causes it to search every commit in the entire
repository looking for additional match candidates.  Instead of it
taking a fraction of a second, it adds 10-15 seconds to the beginning of
every kernel build.

Fix it by adding an additional dummy match which is slightly further
away from the most recent one, along with setting the max candidate
count to 1 (not 2, apparently another bug).

Before:

  $ git checkout c1e939a21eb1
  $ time make kernel/fork.o -s

  real	0m12.403s
  user	0m11.591s
  sys	0m0.967s

After:

  $ time make kernel/fork.o -s

  real	0m1.119s
  user	0m0.658s
  sys	0m0.659s

Link: https://lore.kernel.org/git/20241030044322.b5n3ji2n6gaeo5u6@treble.attlocal.net/
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 scripts/setlocalversion | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Rasmus Villemoes Oct. 31, 2024, 10:37 a.m. UTC | #1
On Wed, Oct 30 2024, Josh Poimboeuf <jpoimboe@kernel.org> wrote:

> If HEAD isn't associated with an annotated tag, a bug (or feature?) in
> "git describe --match" causes it to search every commit in the entire
> repository looking for additional match candidates.  Instead of it
> taking a fraction of a second, it adds 10-15 seconds to the beginning of
> every kernel build.
>
> Fix it by adding an additional dummy match which is slightly further
> away from the most recent one, along with setting the max candidate
> count to 1 (not 2, apparently another bug).
>

cc += git list

Hm, I tried looking at the git describe source code, and while I can't
claim I understand it very well, I think the main problem is around
this part:

			if (!tags && !all && n->prio < 2) {
				unannotated_cnt++;
			} else if (match_cnt < max_candidates) {
				struct possible_tag *t = &all_matches[match_cnt++];
				t->name = n;
				t->depth = seen_commits - 1;
				t->flag_within = 1u << match_cnt;
				t->found_order = match_cnt;
				c->object.flags |= t->flag_within;
				if (n->prio == 2)
					annotated_cnt++;
			}
			else {
				gave_up_on = c;
				break;
			}

So in the case where one doesn't pass any --match, we get something like

git describe --debug 5f78aec0d7e9
describe 5f78aec0d7e9
No exact match on refs or tags, searching to describe
 annotated        243 v4.19-rc5
 annotated        485 v4.19-rc4
 annotated        814 v4.19-rc3
 annotated       1124 v4.19-rc2
 annotated       1391 v4.19-rc1
 annotated      10546 v4.18
 annotated      10611 v4.18-rc8
 annotated      10819 v4.18-rc7
 annotated      11029 v4.18-rc6
 annotated      11299 v4.18-rc5
traversed 11400 commits
more than 10 tags found; listed 10 most recent
gave up search at 1e4b044d22517cae7047c99038abb444423243ca
v4.19-rc5-243-g5f78aec0d7e9

and that "gave up" commit is v4.18-rc4, the eleventh commit
encountered. That also explains why you have to add a "dummy" second
--match to make --candidates=1 have the expected behaviour.

Perhaps the logic should instead be that as soon as match_cnt hits
max_candidates (i.e. all the tags we're going to consider have actually
been visited), we break out. That is, the last "else" above should
instead be replaced by

  if (match_cnt == max_candidates) {
    ... /* ? , gave_up_on is now a misnomer */
    break;
  }

Then as a further DWIM aid, wherever the initialization logic is could
be updated so that, after expanding all the --match= wildcards, if the
number of tags is less than max_candidates, automatically lower
max_candidates to that number (which in the setlocalversion case will
always be 1 because we're not actually passing a wildcard).

Or, we could explicitly on the kernel side pass that --candidates=1, but
yes, I agree it looks like a bug that the loop requires encountering +1
tag to hit that break;.

Rasmus
Jeff King Oct. 31, 2024, 11:42 a.m. UTC | #2
On Thu, Oct 31, 2024 at 11:37:27AM +0100, Rasmus Villemoes wrote:

> and that "gave up" commit is v4.18-rc4, the eleventh commit
> encountered. That also explains why you have to add a "dummy" second
> --match to make --candidates=1 have the expected behaviour.
> 
> Perhaps the logic should instead be that as soon as match_cnt hits
> max_candidates (i.e. all the tags we're going to consider have actually
> been visited), we break out. That is, the last "else" above should
> instead be replaced by
> 
>   if (match_cnt == max_candidates) {
>     ... /* ? , gave_up_on is now a misnomer */
>     break;
>   }

Yes, I agree that is the right direction. Replacing the "else" entirely
feels a little weird, because it is part of the:

  if (!tags && !all && n->prio < 2)
	...
  else if (match_cnt < max_candidates)
	...
  else
	...

So we'd now run that check even if we triggered the first block. But I
don't think it should matter in practice. We only increment match_cnt in
the else-if here. So the "else" block could go away, and the check for
giving up could go inside the else-if.

It does seem like gave_up_on is now pointless, but I'm not sure I
understand all of the code here. I assumed that it was only used to
report "this is where we gave up", and to give you the extra bit of
information that there _were_ other candidates that we omitted (and not
just exactly max_candidates). Of course we don't show that without
--debug. So it seems silly to spend a bunch of extra CPU for that.

But the plot thickens.

What I was going to suggest is that if we wanted to retain that one bit
of information, what we could do instead is: independent of
max_candidates, see if we've found all of the possible names we expanded
from --match. Then max_candidates would work as it does now, but we'd
avoid fruitlessly searching when there are no more names to find.

Counting the number of expanded names is a little weird. We use them to
annotate the commits, but of course multiple names can point to a single
commit, and there's a priority override system. I think the final number
we can find is the number of entries in the "names" hash.

So I expected this to work:

diff --git a/builtin/describe.c b/builtin/describe.c
index 7330a77b38..70a11072de 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -380,6 +380,9 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 				c->object.flags |= t->flag_within;
 				if (n->prio == 2)
 					annotated_cnt++;
+
+				if (match_cnt == hashmap_get_size(&names))
+					break;
 			}
 			else {
 				gave_up_on = c;

but it's still slow! If we set "gave_up_on = c", then it gets fast. I'm
not sure why that is. Later we do:

        if (gave_up_on) {
                commit_list_insert_by_date(gave_up_on, &list);
                seen_commits--;
        }
        seen_commits += finish_depth_computation(&list, &all_matches[0]);

but I don't at all understand why adding gave_up_on lets that finish
sooner. So I'm worried we're missing something about how it is used.

One hack is to just, like the max_candidates case, let us look at one
_more_ commit before bailing. Like this:

diff --git a/builtin/describe.c b/builtin/describe.c
index 7330a77b38..177c8232f6 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -365,6 +365,11 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		struct commit_list *parents = c->parents;
 		struct commit_name **slot;
 
+		if (match_cnt == hashmap_get_size(&names)) {
+			gave_up_on = c;
+			break;
+		}
+
 		seen_commits++;
 		slot = commit_names_peek(&commit_names, c);
 		n = slot ? *slot : NULL;


That works, but I have a feeling that figured out what the heck is going
on with gave_up_on might produce a more elegant solution.

> Then as a further DWIM aid, wherever the initialization logic is could
> be updated so that, after expanding all the --match= wildcards, if the
> number of tags is less than max_candidates, automatically lower
> max_candidates to that number (which in the setlocalversion case will
> always be 1 because we're not actually passing a wildcard).

Yeah, I had the same thought (though if we do a separate hashmap check
as above, it wouldn't be needed).

-Peff
Masahiro Yamada Oct. 31, 2024, 11:43 a.m. UTC | #3
On Thu, Oct 31, 2024 at 11:37 AM Rasmus Villemoes <ravi@prevas.dk> wrote:
>
> On Wed, Oct 30 2024, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> > If HEAD isn't associated with an annotated tag, a bug (or feature?) in
> > "git describe --match" causes it to search every commit in the entire
> > repository looking for additional match candidates.  Instead of it
> > taking a fraction of a second, it adds 10-15 seconds to the beginning of
> > every kernel build.
> >
> > Fix it by adding an additional dummy match which is slightly further
> > away from the most recent one, along with setting the max candidate
> > count to 1 (not 2, apparently another bug).
> >
>
> cc += git list
>
> Hm, I tried looking at the git describe source code, and while I can't
> claim I understand it very well, I think the main problem is around
> this part:
>
>                         if (!tags && !all && n->prio < 2) {
>                                 unannotated_cnt++;
>                         } else if (match_cnt < max_candidates) {
>                                 struct possible_tag *t = &all_matches[match_cnt++];
>                                 t->name = n;
>                                 t->depth = seen_commits - 1;
>                                 t->flag_within = 1u << match_cnt;
>                                 t->found_order = match_cnt;
>                                 c->object.flags |= t->flag_within;
>                                 if (n->prio == 2)
>                                         annotated_cnt++;
>                         }
>                         else {
>                                 gave_up_on = c;
>                                 break;
>                         }
>
> So in the case where one doesn't pass any --match, we get something like
>
> git describe --debug 5f78aec0d7e9
> describe 5f78aec0d7e9
> No exact match on refs or tags, searching to describe
>  annotated        243 v4.19-rc5
>  annotated        485 v4.19-rc4
>  annotated        814 v4.19-rc3
>  annotated       1124 v4.19-rc2
>  annotated       1391 v4.19-rc1
>  annotated      10546 v4.18
>  annotated      10611 v4.18-rc8
>  annotated      10819 v4.18-rc7
>  annotated      11029 v4.18-rc6
>  annotated      11299 v4.18-rc5
> traversed 11400 commits
> more than 10 tags found; listed 10 most recent
> gave up search at 1e4b044d22517cae7047c99038abb444423243ca
> v4.19-rc5-243-g5f78aec0d7e9
>
> and that "gave up" commit is v4.18-rc4, the eleventh commit
> encountered. That also explains why you have to add a "dummy" second
> --match to make --candidates=1 have the expected behaviour.
>
> Perhaps the logic should instead be that as soon as match_cnt hits
> max_candidates (i.e. all the tags we're going to consider have actually
> been visited), we break out. That is, the last "else" above should
> instead be replaced by
>
>   if (match_cnt == max_candidates) {
>     ... /* ? , gave_up_on is now a misnomer */
>     break;
>   }
>
> Then as a further DWIM aid, wherever the initialization logic is could
> be updated so that, after expanding all the --match= wildcards, if the
> number of tags is less than max_candidates, automatically lower
> max_candidates to that number (which in the setlocalversion case will
> always be 1 because we're not actually passing a wildcard).
>
> Or, we could explicitly on the kernel side pass that --candidates=1, but
> yes, I agree it looks like a bug that the loop requires encountering +1
> tag to hit that break;.
>
> Rasmus
>

I still do not understand the logic either.

git traverses all the way back to
d8470b7c13e11c18cf14a7e3180f0b00e715e4f0.



$ git describe --match=v6.12-rc5 --debug c1e939a21eb1
describe c1e939a21eb1
No exact match on refs or tags, searching to describe
finished search at d8470b7c13e11c18cf14a7e3180f0b00e715e4f0
 annotated         44 v6.12-rc5
traversed 1310005 commits
v6.12-rc5-44-gc1e939a21eb1



Or, more simply,


$ git describe --match=v6.12-rc5 --debug e42b1a9a2557
describe e42b1a9a2557
No exact match on refs or tags, searching to describe
finished search at d8470b7c13e11c18cf14a7e3180f0b00e715e4f0
 annotated          5 v6.12-rc5
traversed 1309966 commits
v6.12-rc5-5-ge42b1a9a2557




e42b1a9a2557 merges v6.12-rc5 and 25f00a13dccf.

The latter is obviously, v6.12-rc2 + 4 commits.
v6.12-rc2 is an ancestor of v6.12-rc5.

I do not understand why git traverses 1.3 million commits.



--
Best Regards

Masahiro Yamada
Jeff King Oct. 31, 2024, 12:24 p.m. UTC | #4
On Thu, Oct 31, 2024 at 07:42:10AM -0400, Jeff King wrote:

> That works, but I have a feeling that figured out what the heck is going
> on with gave_up_on might produce a more elegant solution.

OK, I think I might have made some sense of this.

In finish_depth_computation(), we traverse down "list" forever, passing
flags up to our parents, until we find a commit that is marked with the
same "within" flag as our candidate. And then if everything left has
that same "within" flag set, we can bail.

So I _think_ the point is to basically count up what we'd get from this
traversal:

  $tag..$commit

where "$tag" is the candidate tag we found, and "$commit" is what we're
trying to describe (so imagine "git describe --match=$tag $commit").

We can't just use the depth we found while traversing down to $tag,
because there might be side branches we need to count up, too. And we
don't start a new traversal, because we'd be repeating the bits we
already went over when finding $tag in the first place.

And we feed that "list" from the original traversal state. So if we
break out of the traversal early but don't set gave_up_on, then we have
nothing in that state that holds the "within" flag. So we just walk all
of the commits down to the root, because nobody is propagating the flag
to them.

We have to feed at least one commit with the "within" flag into the
traversal so that it can let us end things. But I don't think it really
matters if that commit is the one we found, or if it's a parent of one
that we happened to pass "within" bits down to.

So I think we can just set "gave_up_on" to the final element we found
(whether from max_candidates or from finding every possible name). I.e.,
what I showed earlier, or what you were proposing.


I was also a bit puzzled how this works when there are multiple tags.
We feed only one "best" candidate to finish_depth_computation(), but
gave_up_on does not necessarily have its flag set. But I think that case
the point is that _some_ commit in the list does, and we literally add
every commit to that list.

I'm actually a bit skeptical that any of this is faster than simply
starting over a new traversal of $tag..$commit to find the depth, since
we are considering each commit anew. And there's a bunch of accidentally
quadratic bits of finish_depth_computation(). But frankly I'm somewhat
afraid to touch any of this more than necessary.

-Peff
Jeff King Oct. 31, 2024, 2:43 p.m. UTC | #5
On Thu, Oct 31, 2024 at 08:24:56AM -0400, Jeff King wrote:

> We have to feed at least one commit with the "within" flag into the
> traversal so that it can let us end things. But I don't think it really
> matters if that commit is the one we found, or if it's a parent of one
> that we happened to pass "within" bits down to.
> 
> So I think we can just set "gave_up_on" to the final element we found
> (whether from max_candidates or from finding every possible name). I.e.,
> what I showed earlier, or what you were proposing.

Hmph. So I don't think this is quite true, but now I'm puzzled again.

It is accurate to say that we must make sure _some_ commit with the
those flag bits set remains in "list". And I don't think it matters if
it's the candidate we found, or its parent.

But there's other stuff happening in that loop, after we process that
max candidate (where we'd proposed to break) but before we hit the next
possible candidate. Stuff like adding onto the depth of the other
candidates. Josh's example doesn't show that because it only has one
candidate, but I could imagine a case where it does matter (though I
didn't construct one).

So I'd have thought that this:

diff --git a/builtin/describe.c b/builtin/describe.c
index 7330a77b38..b0f645c41d 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -366,6 +366,12 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		struct commit_name **slot;
 
 		seen_commits++;
+
+		if (match_cnt == max_candidates) {
+			gave_up_on = c;
+			break;
+		}
+
 		slot = commit_names_peek(&commit_names, c);
 		n = slot ? *slot : NULL;
 		if (n) {
@@ -381,10 +387,6 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
 				if (n->prio == 2)
 					annotated_cnt++;
 			}
-			else {
-				gave_up_on = c;
-				break;
-			}
 		}
 		for (cur_match = 0; cur_match < match_cnt; cur_match++) {
 			struct possible_tag *t = &all_matches[cur_match];

would do it, by just finishing out the loop iteration and bailing on the
next commit. After all, that commit _could_ be a candidate itself. But
it causes a test in t6120 to fail. We have a disjoint history like this:

                 B
                 o
                  \
    o-----o---o----x
          A

and we expect that "x" is described as "A-3" (because we are including
the disjoint B). But after the patch above and with --candidates=2
(since there are only two tags and part of our goal is to limit
candidates to the number of tags), we find "B-4". Which is worse (at
least by some metrics).

I think this comes from 30b1c7ad9d (describe: don't abort too early when
searching tags, 2020-02-26). And given the problem description there, I
can see how quitting early in a disjoint history will give you worse
answers. But the patch above is triggering a case that already _could_
trigger.

So it feels like 30b1c7ad9d is incomplete. Without any patches, if I
limit it to --candidates=2 but make A^ a tag, then it gets the same
wrong answer (for the exact same reason). And I don't see a way to make
it correct without losing the ability to break out of the traversal
early when we hit max_candidates (which is obviously a very important
optimization in general). But maybe I'm missing something.

I do think my patch above is not introducing a new problem that wasn't
already there. It's just that the toy repo, having so few tags, means
any logic to reduce max_candidates will trigger there.

+cc the author of 30b1c7ad9d for any wisdom

-Peff
diff mbox series

Patch

diff --git a/scripts/setlocalversion b/scripts/setlocalversion
index 38b96c6797f4..bb8c0bcb7368 100755
--- a/scripts/setlocalversion
+++ b/scripts/setlocalversion
@@ -57,6 +57,8 @@  scm_version()
 		return
 	fi
 
+	githack=" --match=v6.11 --candidates=1"
+
 	# mainline kernel:  6.2.0-rc5  ->  v6.2-rc5
 	# stable kernel:    6.1.7      ->  v6.1.7
 	version_tag=v$(echo "${KERNELVERSION}" | sed -E 's/^([0-9]+\.[0-9]+)\.0(.*)$/\1\2/')
@@ -67,7 +69,7 @@  scm_version()
 	tag=${file_localversion#-}
 	desc=
 	if [ -n "${tag}" ]; then
-		desc=$(git describe --match=$tag 2>/dev/null)
+		desc=$(git describe --match=$tag $githack 2>/dev/null)
 	fi
 
 	# Otherwise, if a localversion* file exists, and the tag
@@ -76,13 +78,13 @@  scm_version()
 	# it. This is e.g. the case in linux-rt.
 	if [ -z "${desc}" ] && [ -n "${file_localversion}" ]; then
 		tag="${version_tag}${file_localversion}"
-		desc=$(git describe --match=$tag 2>/dev/null)
+		desc=$(git describe --match=$tag $githack 2>/dev/null)
 	fi
 
 	# Otherwise, default to the annotated tag derived from KERNELVERSION.
 	if [ -z "${desc}" ]; then
 		tag="${version_tag}"
-		desc=$(git describe --match=$tag 2>/dev/null)
+		desc=$(git describe --match=$tag $githack 2>/dev/null)
 	fi
 
 	# If we are at the tagged commit, we ignore it because the version is