[RFC,0/8] repack: avoid MIDX'ing cruft pack(s) where possible

Message ID cover.1744413969.git.me@ttaylorr.com (mailing list archive)

Message

Taylor Blau April 11, 2025, 11:26 p.m. UTC
This is a short-ish series I wrote today while thinking through an idea
that Peff and I were talking about yesterday that allows us to avoid
MIDX'ing any cruft pack(s) in a repository when repacking.

The core of the idea is to introduce a variant of the '--stdin-packs'
option in 'pack-objects'. The existing behavior is to create a pack
whose contents are the set difference between the specified included and
excluded packs. The new mode (which I'm calling '--stdin-packs=follow')
tweaks the name-hash traversal we do at the end of '--stdin-packs' to
also pick up and pack objects that are reachable from commits in the
above set difference, but don't appear in any included or excluded pack.
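
To make the difference between the two modes concrete, here is a minimal
toy model in Python. It is only a sketch of the set semantics described
above, not the code in builtin/pack-objects.c: the function name
stdin_packs(), the object names, and the 'edges' graph are all
hypothetical, and the walk here simply visits everything reachable and
filters afterwards, which may not match the boundaries of the real
traversal.

```python
def stdin_packs(included, excluded, edges=None, commits=(), follow=False):
    """Toy model: return the object names the new pack would contain.

    included/excluded are lists of sets of object names (one set per pack).
    edges maps an object name to the objects it references (hypothetical).
    """
    inc = set().union(*included)
    exc = set().union(*excluded)
    base = inc - exc  # existing behavior: plain set difference
    if not follow:
        return base

    edges = edges or {}
    result = set(base)
    # Start the walk from commits that are in the set difference.
    stack = [obj for obj in base if obj in commits]
    seen = set(stack)
    while stack:
        obj = stack.pop()
        for child in edges.get(obj, ()):
            if child in seen:
                continue
            seen.add(child)
            stack.append(child)
            # Follow mode: also pack reachable objects that appear in
            # neither an included nor an excluded pack.
            if child not in inc and child not in exc:
                result.add(child)
    return result


# Hypothetical repository: commit c2 -> (parent c1, tree t2), t2 -> blob b2.
# b2 is reachable but lives outside both packs (e.g. loose or in a cruft pack).
edges = {"c2": ["c1", "t2"], "t2": ["b2"], "c1": ["t1"], "t1": ["b1"]}
commits = {"c1", "c2"}
new_pack = {"c2", "t2"}
old_pack = {"c1", "t1", "b1"}

plain = stdin_packs([new_pack], [old_pack])
followed = stdin_packs([new_pack], [old_pack], edges, commits, follow=True)
```

In this model 'plain' holds only c2 and t2, while 'followed' additionally
picks up b2; the objects in the excluded pack are never added in either
mode.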

If you repack consistently using this strategy, you can guarantee that
the union of geometrically-repacked packs is closed under reachability
without having to keep track of any cruft pack(s) in the MIDX.
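
For readers less familiar with the term, "closed under reachability"
means that every object referenced by an object in the set is itself in
the set. A small sketch of that check follows, again in Python and again
purely illustrative: the helper missing_objects() and the example object
graph are hypothetical, not anything from the series.

```python
def missing_objects(objects, edges):
    """Return objects referenced from `objects` but not contained in it.

    An empty result means `objects` is closed under reachability.
    """
    return {
        child
        for obj in objects
        for child in edges.get(obj, ())
        if child not in objects
    }


# Hypothetical object graph: commit c2 -> (parent c1, tree t2), t2 -> blob b2.
edges = {"c2": ["c1", "t2"], "t2": ["b2"], "c1": ["t1"], "t1": ["b1"]}

# Union of non-cruft packs when b2 was left behind in a cruft pack.
without_b2 = {"c1", "t1", "b1", "c2", "t2"}
# Union of non-cruft packs when b2 was pulled into one of them.
with_b2 = without_b2 | {"b2"}

gap = missing_objects(without_b2, edges)
closed = missing_objects(with_b2, edges)
```

Here 'gap' contains b2, so the first union is not closed, while 'closed'
is empty. The claim above is that repacking consistently with the new
mode keeps the non-cruft packs in the second state.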

I'm pretty sure that this is all sound, having played with it for the
better part of the day and not being able to come up with any
counter-examples. I'm sending this as an RFC because I'm not sure if
there's an obvious case that I am missing that makes this whole idea
bogus.

Code review is welcome, but I think at this stage it may be more useful
to center the discussion around whether or not the idea makes sense
first.

Thanks in advance :-).

Taylor Blau (8):
  pack-objects: use standard option incompatibility functions
  pack-objects: limit scope in 'add_object_entry_from_pack()'
  pack-objects: factor out handling '--stdin-packs'
  pack-objects: declare 'rev_info' for '--stdin-packs' earlier
  pack-objects: perform name-hash traversal for unpacked objects
  pack-objects: introduce '--stdin-packs=follow'
  repack: keep track of existing MIDX'd packs
  repack: exclude cruft pack(s) from the MIDX where possible

 Documentation/git-pack-objects.adoc |   8 +-
 builtin/pack-objects.c              | 193 +++++++++++++++++-----------
 builtin/repack.c                    |  97 +++++++++++---
 t/t5331-pack-objects-stdin.sh       | 103 ++++++++++++++-
 t/t7704-repack-cruft.sh             |  70 ++++++++++
 5 files changed, 376 insertions(+), 95 deletions(-)


base-commit: 485f5f863615e670fd97ae40af744e14072cfe18