mbox series

[v2,0/7] pack-bitmap: permute existing namehash values

Message ID cover.1631657157.git.me@ttaylorr.com (mailing list archive)
Headers show
Series pack-bitmap: permute existing namehash values | expand

Message

Taylor Blau Sept. 14, 2021, 10:05 p.m. UTC
Here is a small-ish rerool of my series to carry forward values from an existing
hash-cache when generating multi-pack reachability bitmaps.

Since last time, the performance tests in p5326 were cleaned up in order to
produce timings for generating a pack when using a MIDX bitmap with the
hash-cache extension.

The test-tool implementation in the second patch was also fixed to print the
correct OID for each name-hash. In the previous version, the order we looked up
read the hash-cache was according to pack order, when it should be in index
order. This bug still allowed the tests to pass, because the two orderings are
permutations of one another, so the expected and actual orderings were wrong in
the same way.

It is based on the `tb/multi-pack-bitmaps` topic, which graduated to master.
Thanks in advance for your review.

Taylor Blau (7):
  t/helper/test-bitmap.c: add 'dump-hashes' mode
  pack-bitmap.c: propagate namehash values from existing bitmaps
  midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps
  p5326: create missing 'perf-tag' tag
  p5326: don't set core.multiPackIndex unnecessarily
  p5326: generate pack bitmaps before writing the MIDX bitmap
  t5326: test propagating hashcache values

 Documentation/config/pack.txt      |  4 +++
 builtin/multi-pack-index.c         | 21 +++++++++++++++
 midx.c                             |  6 ++++-
 midx.h                             |  1 +
 pack-bitmap.c                      | 41 +++++++++++++++++++++++++-----
 pack-bitmap.h                      |  1 +
 t/helper/test-bitmap.c             | 10 +++++++-
 t/perf/p5326-multi-pack-bitmaps.sh |  8 +++---
 t/t5326-multi-pack-bitmaps.sh      | 32 +++++++++++++++++++++++
 9 files changed, 113 insertions(+), 11 deletions(-)

Range-diff against v1:
1:  918f9b275a ! 1:  4f2b8d9530 t/helper/test-bitmap.c: add 'dump-hashes' mode
    @@ pack-bitmap.c: int test_bitmap_commits(struct repository *r)
     +{
     +	struct bitmap_index *bitmap_git = prepare_bitmap_git(r);
     +	struct object_id oid;
    -+	uint32_t i;
    ++	uint32_t i, index_pos;
     +
     +	if (!bitmap_git->hashes)
     +		goto cleanup;
     +
     +	for (i = 0; i < bitmap_num_objects(bitmap_git); i++) {
    -+		nth_bitmap_object_oid(bitmap_git, &oid, i);
    ++		if (bitmap_is_midx(bitmap_git))
    ++			index_pos = pack_pos_to_midx(bitmap_git->midx, i);
    ++		else
    ++			index_pos = pack_pos_to_index(bitmap_git->pack, i);
    ++
    ++		nth_bitmap_object_oid(bitmap_git, &oid, index_pos);
     +
     +		printf("%s %"PRIu32"\n",
    -+		       oid_to_hex(&oid), get_be32(bitmap_git->hashes + i));
    ++		       oid_to_hex(&oid), get_be32(bitmap_git->hashes + index_pos));
     +	}
     +
     +cleanup:
2:  fa9f5633f6 = 2:  2cd2f3aa5e pack-bitmap.c: propagate namehash values from existing bitmaps
3:  be8f47e13c ! 3:  f0d8f106c2 midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps
    @@ builtin/multi-pack-index.c: static struct option *add_common_options(struct opti
     +	}
     +
     +	/*
    -+	 * No need to fall-back to 'git_default_config', since this was already
    -+	 * called in 'cmd_multi_pack_index()'.
    ++	 * We should never make a fall-back call to 'git_default_config', since
    ++	 * this was already called in 'cmd_multi_pack_index()'.
     +	 */
     +	return 0;
     +}
-:  ---------- > 4:  a8c6e845ad p5326: create missing 'perf-tag' tag
-:  ---------- > 5:  191922c8f2 p5326: don't set core.multiPackIndex unnecessarily
-:  ---------- > 6:  040bb40548 p5326: generate pack bitmaps before writing the MIDX bitmap
4:  acf3ec60cb ! 7:  fdf71432b3 t5326: test propagating hashcache values
    @@ Commit message
         the configuration to enable the hashcache at all, test that both of
         these function correctly by hardening their behavior with a test.
     
    +    Like the hash-cache in classic single-pack bitmaps, this helps more
    +    proportionally the more up-to-date your bitmap coverage is. When our
    +    bitmap coverage is out-of-date with the ref tips, we spend more time
    +    proportionally traversing, and all of that traversal gets the name-hash
    +    filled in.
    +
    +    But for the up-to-date bitmaps, this helps quite a bit. These numbers
    +    are on git.git, with `pack.threads=1` to help see the difference
    +    reflected in the overall runtime.
    +
    +        Test                            origin/tb/multi-pack-bitmaps   HEAD
    +        -------------------------------------------------------------------------------------
    +        5326.4: simulated clone         1.87(1.80+0.07)                1.46(1.42+0.03) -21.9%
    +        5326.5: simulated fetch         2.66(2.61+0.04)                1.47(1.43+0.04) -44.7%
    +        5326.6: pack to file (bitmap)   2.74(2.62+0.12)                1.89(1.82+0.07) -31.0%
    +
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
      ## t/t5326-multi-pack-bitmaps.sh ##

Comments

Jeff King Sept. 16, 2021, 10:52 p.m. UTC | #1
On Tue, Sep 14, 2021 at 06:05:59PM -0400, Taylor Blau wrote:

> Here is a small-ish rerool of my series to carry forward values from an existing
> hash-cache when generating multi-pack reachability bitmaps.
> 
> Since last time, the performance tests in p5326 were cleaned up in order to
> produce timings for generating a pack when using a MIDX bitmap with the
> hash-cache extension.

I had seen most of these in prototype form before, but I hadn't yet
carefully looked at your cleanups for upstream. I gave them all a
careful read, and it all looks good to me.

I did have a few comments on patch 6 that I think are worth following up
on. We can also do so on top if you prefer.

-Peff