mbox series

[00/23] pack-bitmap: bitmap generation improvements

Message ID cover.1605123652.git.me@ttaylorr.com (mailing list archive)
Headers show
Series pack-bitmap: bitmap generation improvements | expand

Message

Taylor Blau Nov. 11, 2020, 7:41 p.m. UTC
This series contains some patches that GitHub has been using in its fork
for the past few months to improve generating reachability bitmaps,
particularly in pathological cases, such as the repo containing all
forks of chromium/chromium.

The patches that follow are organized into five parts:

  - The first nine patches do some basic clean-up and fix a bug that we
    were able to exercise in tests while writing these patches.

  - The next two patches reimplements bitmap writing in order to avoid making
    multiple passes over the object graph. This approach ends up
    regressing both the time and memory used to generate bitmaps on the
    kernel's fork-network, but ends up being a useful stepping stone for
    further improvements.

  - The six patches that follow that culminates in a patch to build
    fewer intermediate bitmaps during the walk in order to reduce both
    memory and time for reasonably-sized repositories. (Which
    intermediate bitmaps are considered "important" is discussed in
    detail in the seventeenth patch).

  - The next several patches make reusing previously generated reachability
    bitmaps purely an optimization for generating new bitmaps. Importantly, that
    allows the bitmap selection process to pick better commits to bitmap going
    forward, rather than blindly reusing previously selected ones. They also
    include some light refactoring, and a patch to avoid tree walks when
    existing bitmaps suffice.

  - The final two patches address a trade-off in the prior patches between
    walking a wide history only once with high memory cost, and walking the same
    history multiple times with lower memory cost. Here, the walk is reduced to
    only cover the first-parent history. The final patch treats existing bitmaps
    as maximal in order to make it more difficult for a different set of
    selected commits to "walk around" the previously selected commits and force
    a large number of new bitmaps to be computed.

In the end, no block-buster performance improvements are attained on
normal-to-large sized repositories, but the new bitmap generation routine helps
substantially on enormous repositories, like the chromium/chromium fork-network.

Individual performance numbers are available in the patches throughout.

This series is a prerequisite to a list of other bitmap-related patches in
GitHub's fork, including multi-pack bitmaps.

Derrick Stolee (9):
  pack-bitmap-write: fill bitmap with commit history
  bitmap: add bitmap_diff_nonzero()
  commit: implement commit_list_contains()
  t5310: add branch-based checks
  pack-bitmap-write: rename children to reverse_edges
  pack-bitmap-write: build fewer intermediate bitmaps
  pack-bitmap-write: use existing bitmaps
  pack-bitmap-write: relax unique rewalk condition
  pack-bitmap-write: better reuse bitmaps

Jeff King (11):
  pack-bitmap: fix header size check
  pack-bitmap: bounds-check size of cache extension
  t5310: drop size of truncated ewah bitmap
  rev-list: die when --test-bitmap detects a mismatch
  ewah: factor out bitmap growth
  ewah: make bitmap growth less aggressive
  ewah: implement bitmap_or()
  ewah: add bitmap_dup() function
  pack-bitmap-write: reimplement bitmap writing
  pack-bitmap-write: pass ownership of intermediate bitmaps
  pack-bitmap-write: ignore BITMAP_FLAG_REUSE

Taylor Blau (3):
  ewah/ewah_bitmap.c: grow buffer past 1
  pack-bitmap: factor out 'bitmap_for_commit()'
  pack-bitmap: factor out 'add_commit_to_bitmap()'

 builtin/pack-objects.c  |   1 -
 commit.c                |  11 +
 commit.h                |   2 +
 ewah/bitmap.c           |  54 ++++-
 ewah/ewah_bitmap.c      |   2 +-
 ewah/ewok.h             |   3 +-
 pack-bitmap-write.c     | 452 +++++++++++++++++++++++++---------------
 pack-bitmap.c           | 130 +++++-------
 pack-bitmap.h           |   8 +-
 t/t5310-pack-bitmaps.sh | 164 ++++++++++++---
 10 files changed, 548 insertions(+), 279 deletions(-)

--
2.29.2.156.gc03786897f