diff mbox series

[05/10] p5303: measure time to repack with keep

Message ID b3b2574d4d9d10f226b52d81fe0e6bf1f761504e.1611098616.git.me@ttaylorr.com (mailing list archive)
State Superseded
Headers show
Series repack: support repacking into a geometric sequence | expand

Commit Message

Taylor Blau Jan. 19, 2021, 11:24 p.m. UTC
From: Jeff King <peff@peff.net>

This is the same as the regular repack test, except that we mark the
single base pack as "kept" and use --assume-kept-packs-closed. The
theory is that this should be faster than the normal repack, because
we'll have fewer objects to traverse and process.

And indeed, it is much faster in the single-pack case (all timings
measured on the kernel):

  5303.5: repack (1)                 57.29(54.88+10.39)
  5303.6: repack with keep (1)       1.25(1.19+0.05)

and in the 50-pack case:

  5303.10: repack (50)               89.71(132.78+6.14)
  5303.11: repack with keep (50)     6.92(26.93+0.58)

but our improvements vanish as we approach 1000 packs.

  5303.15: repack (1000)             217.14(493.76+15.29)
  5303.16: repack with keep (1000)   209.46(387.83+8.42)

That's because the code paths around handling .keep files are known to
scale badly; they look in every single pack file to find each object.
Our solution to that was to notice that most repos don't have keep
files, and to make that case a fast path. But as soon as you add a
single .keep, that part of pack-objects slows down again (even if we
have fewer objects total to look at).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/perf/p5303-many-packs.sh | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Comments

Junio C Hamano Jan. 29, 2021, 3:40 a.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> From: Jeff King <peff@peff.net>

Not a fault of this series at all, but before the precontext of the
first hunk, there is  


> diff --git a/t/perf/p5303-many-packs.sh b/t/perf/p5303-many-packs.sh
> index 277d22ec4b..85b077b72b 100755
> --- a/t/perf/p5303-many-packs.sh
> +++ b/t/perf/p5303-many-packs.sh
> @@ -27,8 +27,11 @@ repack_into_n () {

this construct:

	... |
	sed -n '1~5p' |
	head -n "$1" |
        ...

which is a GNUism.  Peff often says that very small population
actually run our perf suite, and this seems to corroborate the
conjecture.

>  	>pushes &&
>  
>  	# create base packfile
> -	head -n 1 pushes |
> -	git pack-objects --delta-base-offset --revs staging/pack &&
> +	base_pack=$(
> +		head -n 1 pushes |
> +		git pack-objects --delta-base-offset --revs staging/pack
> +	) &&
> +	test_export base_pack &&
>  
>  	# and then incrementals between each pair of commits
>  	last= &&
> @@ -87,6 +90,15 @@ do
>  		  --reflog --indexed-objects --delta-base-offset \
>  		  --stdout </dev/null >/dev/null
>  	'
> +
> +	test_perf "repack with keep ($nr_packs)" '
> +		git pack-objects --keep-true-parents \
> +		  --honor-pack-keep --assume-kept-packs-closed \
> +		  --keep-pack=pack-$base_pack.pack \
> +		  --non-empty --all \
> +		  --reflog --indexed-objects --delta-base-offset \
> +		  --stdout </dev/null >/dev/null
> +	'
>  done
>  
>  # Measure pack loading with 10,000 packs.
Jeff King Jan. 29, 2021, 7:32 p.m. UTC | #2
On Thu, Jan 28, 2021 at 07:40:40PM -0800, Junio C Hamano wrote:

> > diff --git a/t/perf/p5303-many-packs.sh b/t/perf/p5303-many-packs.sh
> > index 277d22ec4b..85b077b72b 100755
> > --- a/t/perf/p5303-many-packs.sh
> > +++ b/t/perf/p5303-many-packs.sh
> > @@ -27,8 +27,11 @@ repack_into_n () {
> 
> this construct:
> 
> 	... |
> 	sed -n '1~5p' |
> 	head -n "$1" |
>         ...
> 
> which is a GNUism.  Peff often says that very small population
> actually run our perf suite, and this seems to corroborate the
> conjecture.

Oops. Looks like I was the one who introduced that. Nobody seems to have
complained, so I'm somewhat tempted to leave it. But it would not be too
hard to replace with perl, I think.

-Peff
Junio C Hamano Jan. 29, 2021, 8:38 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> On Thu, Jan 28, 2021 at 07:40:40PM -0800, Junio C Hamano wrote:
>
>> > diff --git a/t/perf/p5303-many-packs.sh b/t/perf/p5303-many-packs.sh
>> > index 277d22ec4b..85b077b72b 100755
>> > --- a/t/perf/p5303-many-packs.sh
>> > +++ b/t/perf/p5303-many-packs.sh
>> > @@ -27,8 +27,11 @@ repack_into_n () {
>> 
>> this construct:
>> 
>> 	... |
>> 	sed -n '1~5p' |
>> 	head -n "$1" |
>>         ...
>> 
>> which is a GNUism.  Peff often says that very small population
>> actually run our perf suite, and this seems to corroborate the
>> conjecture.
>
> Oops. Looks like I was the one who introduced that. Nobody seems to have
> complained, so I'm somewhat tempted to leave it. But it would not be too
> hard to replace with perl, I think.

Yeah, but would it be worth it?  I am actually OK to say that you
need GNU sed if you want to run perf.  We already rely on GNU time
to run perf tests, no?
Jeff King Jan. 29, 2021, 10:10 p.m. UTC | #4
On Fri, Jan 29, 2021 at 12:38:08PM -0800, Junio C Hamano wrote:

> > Oops. Looks like I was the one who introduced that. Nobody seems to have
> > complained, so I'm somewhat tempted to leave it. But it would not be too
> > hard to replace with perl, I think.
> 
> Yeah, but would it be worth it?  I am actually OK to say that you
> need GNU sed if you want to run perf.  We already rely on GNU time
> to run perf tests, no?

True. This one is a little worse because it's subtle, and somebody might
copy it unknowingly into the regular test suite.

I am happy to leave it, or for you to pick up the patch I sent earlier
(which I did verify produces identical output).

-Peff
Junio C Hamano Jan. 29, 2021, 11:12 p.m. UTC | #5
Jeff King <peff@peff.net> writes:

> On Fri, Jan 29, 2021 at 12:38:08PM -0800, Junio C Hamano wrote:
>
>> > Oops. Looks like I was the one who introduced that. Nobody seems to have
>> > complained, so I'm somewhat tempted to leave it. But it would not be too
>> > hard to replace with perl, I think.
>> 
>> Yeah, but would it be worth it?  I am actually OK to say that you
>> need GNU sed if you want to run perf.  We already rely on GNU time
>> to run perf tests, no?
>
> True. This one is a little worse because it's subtle, and somebody might
> copy it unknowingly into the regular test suite.
>
> I am happy to leave it, or for you to pick up the patch I sent earlier
> (which I did verify produces identical output).

Yeah, I would be very unhappy if somebody copied-and-pasted it, but
somehow I didn't think too many people moved code in that direction
;-)

Will apply the portability fix, then.

Thanks.
diff mbox series

Patch

diff --git a/t/perf/p5303-many-packs.sh b/t/perf/p5303-many-packs.sh
index 277d22ec4b..85b077b72b 100755
--- a/t/perf/p5303-many-packs.sh
+++ b/t/perf/p5303-many-packs.sh
@@ -27,8 +27,11 @@  repack_into_n () {
 	>pushes &&
 
 	# create base packfile
-	head -n 1 pushes |
-	git pack-objects --delta-base-offset --revs staging/pack &&
+	base_pack=$(
+		head -n 1 pushes |
+		git pack-objects --delta-base-offset --revs staging/pack
+	) &&
+	test_export base_pack &&
 
 	# and then incrementals between each pair of commits
 	last= &&
@@ -87,6 +90,15 @@  do
 		  --reflog --indexed-objects --delta-base-offset \
 		  --stdout </dev/null >/dev/null
 	'
+
+	test_perf "repack with keep ($nr_packs)" '
+		git pack-objects --keep-true-parents \
+		  --honor-pack-keep --assume-kept-packs-closed \
+		  --keep-pack=pack-$base_pack.pack \
+		  --non-empty --all \
+		  --reflog --indexed-objects --delta-base-offset \
+		  --stdout </dev/null >/dev/null
+	'
 done
 
 # Measure pack loading with 10,000 packs.