mbox series

[v2,00/10] gc: enable cruft packs by default

Message ID cover.1681850424.git.me@ttaylorr.com (mailing list archive)
Headers show
Series gc: enable cruft packs by default | expand

Message

Taylor Blau April 18, 2023, 8:40 p.m. UTC
Here is a very tiny reroll of my series to graduate `gc.cruftPacks` out
of `feature.experimental` and enables it by default.

A complete summary of the topic is available in the original cover
letter[1], and the changes since last time are limited to test clean-up,
patch reorganization, and some touch-ups on the patch messages
themselves.

As always, a range-diff is below for convenience.

Thanks for all of the review thus far, and in advance for any review on
this round, too.

[1]: https://lore.kernel.org/git/cover.1681764848.git.me@ttaylorr.com/

Taylor Blau (10):
  pack-write.c: plug a leak in stage_tmp_packfiles()
  builtin/repack.c: fix incorrect reference to '-C'
  builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  t/t5304-prune.sh: prepare for `gc --cruft` by default
  t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  t/t6500-gc.sh: refactor cruft pack tests
  t/t6500-gc.sh: add additional test cases
  t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  builtin/gc.c: make `gc.cruftPacks` enabled by default
  repository.h: drop unused `gc_cruft_packs`

 Documentation/config/feature.txt |   3 -
 Documentation/config/gc.txt      |  12 +--
 Documentation/git-gc.txt         |  12 +--
 Documentation/gitformat-pack.txt |   4 +-
 builtin/gc.c                     |   8 +-
 builtin/repack.c                 |   2 +-
 pack-write.c                     |  14 ++--
 repo-settings.c                  |   4 +-
 repository.h                     |   1 -
 t/t5304-prune.sh                 |  28 +++----
 t/t6500-gc.sh                    | 135 ++++++++++++++++---------------
 t/t6501-freshen-objects.sh       |  10 +--
 t/t9300-fast-import.sh           |  13 +--
 13 files changed, 120 insertions(+), 126 deletions(-)

Range-diff against v1:
 1:  65ac7ed9b8 =  1:  c477b754e7 pack-write.c: plug a leak in stage_tmp_packfiles()
 2:  fbc8d15032 =  2:  52fb61fa9c builtin/repack.c: fix incorrect reference to '-C'
 3:  796df920ad !  3:  63b9ee8e2e builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
    @@ Commit message
           - The same is true for `gc.bigPackThreshold`, if the size of the cruft
             pack exceeds the limit set by the caller.
     
    -    Ignore cruft packs in the common implementation for both of these
    -    options, and add a pair of tests to prevent any future regressions here.
    +    In the future, it is possible that `gc.bigPackThreshold` could be used
    +    to write a separate cruft pack containing any new unreachable objects
    +    that entered the repository since the last time a cruft pack was
    +    written.
    +
    +    There are some complexities to doing so, mainly around handling
    +    pruning objects that are in an existing cruft pack that is above the
    +    threshold (which would either need to be rewritten, or else delay
    +    pruning). Rewriting a substantially similar cruft pack isn't ideal, but
    +    it is significantly better than the status-quo.
    +
    +    If users have large cruft packs that they don't want to rewrite, they
    +    can mark them as `*.keep` packs. But in general, if a repository has a
    +    cruft pack that is so large it is slowing down GC's, it should probably
    +    be pruned anyway.
    +
    +    In the meantime, ignore cruft packs in the common implementation for
    +    both of these options, and add a pair of tests to prevent any future
    +    regressions here.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
    @@ Documentation/git-gc.txt: be performed as well.
     -	All packs except the largest pack and those marked with a
     -	`.keep` files are consolidated into a single pack. When this
     -	option is used, `gc.bigPackThreshold` is ignored.
    -+	All packs except the largest pack, any packs marked with a
    -+	`.keep` file, and any cruft pack(s) are consolidated into a
    -+	single pack. When this option is used, `gc.bigPackThreshold` is
    -+	ignored.
    ++	All packs except the largest non-cruft pack, any packs marked
    ++	with a `.keep` file, and any cruft pack(s) are consolidated into
    ++	a single pack. When this option is used, `gc.bigPackThreshold`
    ++	is ignored.
      
      AGGRESSIVE
      ----------
 4:  44006da959 =  4:  905eeb9027 t/t5304-prune.sh: prepare for `gc --cruft` by default
 8:  4ccc525c39 !  5:  fa6eafb1fe t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
    @@ Commit message
         cover the case of freshening loose objects not using cruft packs.
     
         We could run this test twice, once with `--cruft` and once with
    -    `--no-cruft`, but doing so is unnecessary, since the object rescuing and
    -    freshening behavior is already extensively tested via t5329.
    +    `--no-cruft`, but doing so is unnecessary, since we already test object
    +    rescuing, freshening, and dealing with corrupt parts of the unreachable
    +    object graph extensively via t5329.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
 6:  56a965e517 =  6:  e6270d75fa t/t6500-gc.sh: refactor cruft pack tests
 7:  6957e54f51 !  7:  9db3fa9e36 t/t6500-gc.sh: add additional test cases
    @@ Commit message
         which enumerates all possible combinations of arguments that will
         produce (or not produce) a cruft pack.
     
    -    This prepares us for the following commit, which will change the default
    +    This prepares us for a future commit which will change the default value
         of `gc.cruftPacks` by ensuring that we understand which invocations do
         and do not change as a result.
     
    @@ t/t6500-gc.sh: do
      done
      
      for argv in \
    -+	"gc --no-cruft" \
    ++	"gc" \
     +	"-c gc.cruftPacks=false gc" \
     +	"-c gc.cruftPacks=true gc --no-cruft" \
      	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
 5:  1b07eb83fe !  8:  894cf176ea t/t9300-fast-import.sh: prepare for `gc --cruft` by default
    @@ Metadata
      ## Commit message ##
         t/t9300-fast-import.sh: prepare for `gc --cruft` by default
     
    -    In a similar fashion as the previous commit, adjust the fast-import
    -    tests to prepare for "git gc" generating a cruft pack by default.
    +    In a similar fashion as previous commits, adjust the fast-import tests
    +    to prepare for "git gc" generating a cruft pack by default.
     
         This adjustment is slightly different, however. Instead of relying on us
         writing out the objects loose, and then calling `git prune` to remove
    @@ Commit message
         one `git gc --prune`, which handles pruning both loose objects, and
         objects that would otherwise be written to a cruft pack.
     
    +    Likely this pattern of "git gc && git prune" started all the way back in
    +    03db4525d3 (Support gitlinks in fast-import., 2008-07-19), which
    +    happened after deprecating `git gc --prune` in 9e7d501990 (builtin-gc.c:
    +    deprecate --prune, it now really has no effect, 2008-05-09).
    +
    +    After `--prune` was un-deprecated in 58e9d9d472 (gc: make --prune useful
    +    again by accepting an optional parameter, 2009-02-14), this script got a
    +    handful of new "git gc && git prune" instances via via 4cedb78cb5
    +    (fast-import: add input format tests, 2011-08-11). These could have been
    +    `git gc --prune`, but weren't (likely taking after 03db4525d3).
    +
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
      ## t/t9300-fast-import.sh ##
 9:  bfda40a21d !  9:  b6784ddfe2 builtin/gc.c: make `gc.cruftPacks` enabled by default
    @@ t/t6500-gc.sh: assert_no_cruft_packs () {
      }
      
      for argv in \
    +-	"gc --cruft" \
     +	"gc" \
    - 	"gc --cruft" \
      	"-c gc.cruftPacks=true gc" \
     -	"-c gc.cruftPacks=false gc --cruft" \
     -	"-c feature.experimental=true gc" \
    @@ t/t6500-gc.sh: assert_no_cruft_packs () {
      do
      	test_expect_success "git $argv generates a cruft pack" '
      		test_when_finished "rm -fr repo" &&
    -@@ t/t6500-gc.sh: done
    +@@ t/t6500-gc.sh: do
    + done
    + 
      for argv in \
    - 	"gc --no-cruft" \
    +-	"gc" \
    ++	"gc --no-cruft" \
      	"-c gc.cruftPacks=false gc" \
     -	"-c gc.cruftPacks=true gc --no-cruft" \
     -	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
10:  fa15125454 = 10:  c67ee7c2ff repository.h: drop unused `gc_cruft_packs`